2.25.2.0-b163

@fizaaluthra fizaaluthra tagged this 14 Mar 14:10
Summary:
## Background:

Commit 9c8ae9628fa5ab401d5958074729d1fab173b55f / D40946 introduced changes that diverge from vanilla PG behaviour. Specifically, `relfilenode` and `relam` are set for parent partitioned tables (they are both 0 in vanilla PG). `RelationInitTableAccessMethod(...)` is now executed for parent partitioned tables as well.

This change in `relam` breaks upgrades from 2.25.0 to 2.25.1. 2.25.0 does not contain the commit, so `relam` for a parent partitioned table created on 2.25.0 is 0. After the upgrade, calls to `RelationInitTableAccessMethod(...)` for that parent partitioned table therefore fail with `cache lookup failed for access method 0`.
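
One way to observe the divergence is to inspect `pg_class` directly. The query below is a minimal sketch rather than part of this change; it uses only standard catalog columns and lists every partitioned parent (`relkind = 'p'`). Based on the description above, a build without the commit (or vanilla PG 15) should report 0 for both `relam` and `relfilenode`, while a build with the commit should report non-zero values.

```sql
-- List partitioned parent tables with their access method and relfilenode.
-- relkind = 'p' selects partitioned tables only (not their partitions).
SELECT relname, relkind, relam, relfilenode
FROM pg_class
WHERE relkind = 'p'
ORDER BY relname;
```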

## Solution:
Rework the fix in 9c8ae9628fa5ab401d5958074729d1fab173b55f / D40946: setting `relam` on the parent is not required in order to rewrite it.
Revert the changes that set `relam`. Instead of executing the entire `ATRewriteTable` flow (which requires an access method to be set for the relation), execute only `RelationSetNewRelfilenode` (which gets a new relfilenode, updates the `pg_class` value, creates the new DocDB table, and drops the old one), followed by `reindex_relation`.
Jira: DB-15669

Test Plan:
./yb_build.sh release --cxx-test pg15_upgrade-test --gtest_filter Pg15UpgradeTest.PartitionedTableRewrite
./yb_build.sh --cxx-test pgwrapper_pg_ddl_atomicity-test --gtest_filter PgDdlAtomicityTest.TestPartitionedTableSchemaVerification
./yb_build.sh --cxx-test yb-backup-cross-feature-test --gtest_filter YBBackupTest.TestPartitionedTableRewrite

New test: PartitionedTableUpgradeTest.SimplePartitionedTable
./yb_build.sh --cxx-test integration-tests_basic_upgrade-test --gtest_filter PartitionedTableUpgradeTest.SimplePartitionedTable

Fails on master with:
```
TRAP: FailedAssertion("relation->rd_rel->relam != InvalidOid", File: "../../../../../../../src/postgres/src/backend/utils/cache/relcache.c", Line: 3871, PID: 35093)
```

Manual test:

Create a 2.25.0 cluster and execute the following:

```
yugabyte=# CREATE TABLE orders (
yugabyte(#         order_id SERIAL,
yugabyte(#         order_date DATE NOT NULL,
yugabyte(#         customer_name VARCHAR(255),
yugabyte(#         product_name VARCHAR(255),
yugabyte(#         quantity INT
yugabyte(# ) PARTITION BY RANGE(EXTRACT(YEAR FROM order_date));
CREATE TABLE
yugabyte=# CREATE TABLE orders_2019 PARTITION OF orders FOR VALUES FROM (2019) TO (2020);
CREATE TABLE
yugabyte=# CREATE TABLE orders_2020 PARTITION OF orders FOR VALUES FROM (2020) TO (2021);
CREATE TABLE
yugabyte=# CREATE TABLE orders_2021 PARTITION OF orders FOR VALUES FROM (2021) TO (2022);
CREATE TABLE
```
Then, stop the cluster, restart it with a 2.25.1 build, and try reconnecting. This fails:
```
$ ./bin/ysqlsh
ysqlsh: error: connection to server at "localhost" (::1), port 5433 failed: Connection refused
	Is the server running on that host and accepting TCP/IP connections?
connection to server at "localhost" (127.0.0.1), port 5433 failed: FATAL:  cache lookup failed for access method 0
```

Now, restart the cluster with the fix, and try reconnecting. This works:
```
$ ./bin/ysqlsh
ysqlsh (15.2-YB-2.25.2.0-b0)
Type "help" for help.

yugabyte=#
```

Reviewers: myang, hsunder

Reviewed By: myang, hsunder

Subscribers: hsunder, smishra, yql

Differential Revision: https://phorge.dev.yugabyte.com/D42492