Summary:
When running the following commands:
```
CREATE ROLE test;
SELECT * from pg_yb_catalog_version;
ALTER ROLE test CONNECTION LIMIT 1;
SELECT * from pg_yb_catalog_version;
ALTER ROLE test CONNECTION LIMIT -1;
SELECT * from pg_yb_catalog_version;
ALTER ROLE test CONNECTION LIMIT -1;
SELECT * from pg_yb_catalog_version;
```
Currently we see catalog version goes up by 1 for each of the `ALTER ROLE`
statement:
```
yugabyte=# CREATE ROLE test;
CREATE ROLE
yugabyte=# SELECT * from pg_yb_catalog_version;
db_oid | current_version | last_breaking_version
--------+-----------------+-----------------------
1 | 1 | 1
4 | 1 | 1
5 | 1 | 1
13515 | 1 | 1
13516 | 1 | 1
(5 rows)
yugabyte=# ALTER ROLE test CONNECTION LIMIT 1;
ALTER ROLE
yugabyte=# SELECT * from pg_yb_catalog_version;
db_oid | current_version | last_breaking_version
--------+-----------------+-----------------------
1 | 2 | 2
4 | 2 | 2
5 | 2 | 2
13515 | 2 | 2
13516 | 2 | 2
(5 rows)
yugabyte=# ALTER ROLE test CONNECTION LIMIT -1;
ALTER ROLE
yugabyte=# SELECT * from pg_yb_catalog_version;
db_oid | current_version | last_breaking_version
--------+-----------------+-----------------------
1 | 3 | 3
4 | 3 | 3
5 | 3 | 3
13515 | 3 | 3
13516 | 3 | 3
(5 rows)
yugabyte=# ALTER ROLE test CONNECTION LIMIT -1;
ALTER ROLE
yugabyte=# SELECT * from pg_yb_catalog_version;
db_oid | current_version | last_breaking_version
--------+-----------------+-----------------------
1 | 4 | 4
4 | 4 | 4
5 | 4 | 4
13515 | 4 | 4
13516 | 4 | 4
(5 rows)
```
The last `ALTER ROLE` statement does not change anything and is logically a
no-op, there is no need to increment catalog version, which has side effect of
triggering catalog cache refreshes on all active connections of all databases.
Note that `ALTER ROLE` is particularly high impact because it affects all
databases.
I made a change to detect such a no-op `ALTER ROLE` DDL statement so that YSQL
will not issue any catalog writes. Then our existing optimization will work as
expected to avoid catalog version increment.
A new GUC is added `yb_enable_nop_alter_role_optimization` which can be used
to turn off this optimization.
NOTE: There is a semantic change as discussed below:
1. node n1 ALTER ROLE sets connection limit -1 at t1
2. node n2 ALTER ROLE sets connection limit 100 at t2
3. node n1 ALTER ROLE sets connection limit -1 at t3
t1 < t2 < t3
Assuming there is t2 - t1 > heartbeat delay, then n2 will see -1, since 100 !=
-1, n2's ALTER ROLE is not a nop and will write 100.
If t3 - t2 > heartbeat delay, then n1 will see 100 because the new catalog
version causes n1 to refresh its catalog cache and hence n1 will see 100,
since -1 != 100, n1's ALTER ROLE is not a nop and will write -1 as expected.
If t3 - t2 < heartbeat delay, then n1 will not see 100 because the new catalog
version has not propagated yet and n1 will see -1, since -1 == -1, n1's ALTER
ROLE will be a nop. **But before this diff, n1's ALTER ROLE will fail due to a
conflict when n1 tries to write -1 again. This is the semantic change that is
user visible.**
We can argue that because t3 - t2 < heartbeat delay, n2's ALTER ROLE and n1's
ALTER ROLE are **concurrent**, and therefore YubaByte is free to reorder them so
that n1's ALTER ROLE is arranged to be before n2's ALTER ROLE. Then it is very
reasonable for n1's ALTER ROLE to be a nop.
But if customers really need to depend on the old behavior where n1's ALTER ROLE
fails with an error, then they can use the PG gflag
`--ysql_yb_enable_nop_alter_role_optimization=false` to get the old behavior back.
Jira: DB-13301
Test Plan: ./yb_build.sh --java-test 'org.yb.pgsql.TestPgRegressMisc#testPgRegressMiscIndependent'
Reviewers: kfranz, mihnea
Reviewed By: kfranz
Subscribers: yql
Differential Revision: https://phorge.dev.yugabyte.com/D38900