Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

pg_appendonly.compresstype, compresslevel and blocksize reset to defaults after ALTER reorganize=true|false #8207

Closed
soumyadeep2007 opened this issue Jul 22, 2019 · 5 comments · Fixed by #13847

Comments

@soumyadeep2007
Copy link
Contributor

soumyadeep2007 commented Jul 22, 2019

Greenplum version or build

master

Expected behavior

ALTER TABLE with reorganize=true|false should not modify the compresstype, compresslevel and blocksize columns of the pg_appendonly table for the table under consideration.

Actual behavior

The compress_type is changed to empty "".

Step to reproduce the behavior

  1. Start with a fresh cluster for simplicity.
  2. Execute the following:
postgres=# create table t (id smallint, id1 smallint, id2 smallint, id3 smallint,id4 smallint) with (appendonly=true,orientation=column,compresstype=rle_type);
NOTICE:  Table doesn't have 'DISTRIBUTED BY' clause -- Using column named 'id' as the Greenplum Database data distribution key for this table.
HINT:  The 'DISTRIBUTED BY' clause determines the distribution of data. Make sure column(s) chosen are the optimal data distribution key to minimize skew.
CREATE TABLE
postgres=# select * from pg_appendonly;
 relid | blocksize | safefswritesize | compresslevel | checksum | compresstype | columnstore | segrelid | blkdirrelid | blkdiridxid | visimaprelid | visimapidxid
-------+-----------+-----------------+---------------+----------+--------------+-------------+----------+-------------+-------------+--------------+--------------
 16384 |     32768 |               0 |             1 | t        | rle_type     | t           |    16386 |           0 |           0 |        16388 |        16390
(1 row)

postgres=# alter table t set with (reorganize=true);
ALTER TABLE
postgres=# select * from pg_appendonly;
 relid | blocksize | safefswritesize | compresslevel | checksum | compresstype | columnstore | segrelid | blkdirrelid | blkdiridxid | visimaprelid | visimapidxid
-------+-----------+-----------------+---------------+----------+--------------+-------------+----------+-------------+-------------+--------------+--------------
 16384 |     32768 |               0 |             0 | t        |              | t           |    16393 |           0 |           0 |        16395 |        16397
(1 row)
@soumyadeep2007
Copy link
Contributor Author

soumyadeep2007 commented Aug 16, 2019

RCA:

  1. The alter table statement involves creation of a temp ao table with a CTAS (whose relfilenode is later switched with the given AO table) inprebuild_temp_table. We empty out the compression settings for the temp table:
if (RelationIsAoCols(rel))
{
	if (useExistingColumnAttributes)
	{
		/* 
		 * Need to remove table level compression settings for the
		 * AOCO case since they're set at the column level.
		 */
		ListCell *lc;

		foreach(lc, opts)
		{
			DefElem *de = lfirst(lc);

			if (de->defname &&
				(strcmp("compresstype", de->defname) == 0 ||
				 strcmp("compresslevel", de->defname) == 0 ||
				 strcmp("blocksize", de->defname) == 0))
				continue;
			else
				cs->options = lappend(cs->options, de);
		}
		col_encs = RelationGetUntransformedAttributeOptions(rel);
	}
	.
	.
  1. These options become part of the CTAS and then lead to an update of the pg_appendonly relation in heap_create_with_catalog() -> InsertAppendOnlyEntry().

The fix is to replace:

if (de->defname &&
				(strcmp("compresstype", de->defname) == 0 ||
				 strcmp("compresslevel", de->defname) == 0 ||
				 strcmp("blocksize", de->defname) == 0))
				continue;
			else
				cs->options = lappend(cs->options, de);

with

cs->options = lappend(cs->options, de);

@soumyadeep2007
Copy link
Contributor Author

soumyadeep2007 commented Aug 16, 2019

Impact:

/* 
* Need to remove table level compression settings for the
* AOCO case since they're set at the column level.
*/

This seems to suggest that these values for pg_appendonly (compresstype, compresslevel and blocksize) at the table level are never used for AOCS tables. Actually verifying that is non-trivial:
On the surface: These values are used a bunch of places (part of many structs) without an immediately surrounding guard for RelationIsAoCols. Also, they are not nulled out when an AOCS table is created.

Even if no code-path actually reads these values, we might, in the future, mistakenly or otherwise use them.

Thus, we should either:

  1. Apply the fix mentioned above to avoid inconsistency.
  2. Explicitly ensure that these attributes are nulled out for AOCS tables (e.g. when they are created) and introduce sanity checks to cover these.

@soumyadeep2007 soumyadeep2007 added this to the gpdb-6 milestone Aug 16, 2019
@soumyadeep2007 soumyadeep2007 changed the title pg_appendonly.compress_type nulled out after ALTER reorganize=true|false pg_appendonly.compresstype, compresslevel and blocksize reset to defaults after ALTER reorganize=true|false Aug 19, 2019
@ashwinstar
Copy link
Contributor

Just reflecting the reality that this continues to be the issue and is still reproducible.

@huansong
Copy link
Contributor

huansong commented Jul 26, 2022

According to GPDB doc, the table-level options still take effect on AOCO tables, but just the column-level options will take precedence:https://docs.vmware.com/en/VMware-Tanzu-Greenplum/6/greenplum-database/GUID-admin_guide-ddl-ddl-storage.html#precedence-of-compression-settings-11

And, at least from the metadata perspective, GPDB behaves so too:

postgres=# create table t(a int, b int, column a encoding (compresstype=zlib))
with (appendonly=true, orientation=column, compresstype=rle_type);
CREATE TABLE
postgres=# select * from pg_attribute_encoding;
 attrelid | attnum |                       attoptions
----------+--------+---------------------------------------------------------
    35569 |      1 | {compresstype=zlib,compresslevel=1,blocksize=32768}
    35569 |      2 | {compresstype=rle_type,compresslevel=1,blocksize=32768}
(2 rows)

So if no column-level option is given for a certain column, pg_attribute_encoding will fetch the table-level option for it (column b above).

And if we add new column w/o specifying column-level options, it will also take the table-level options:

postgres=# alter table t add column c int;
ALTER TABLE
postgres=# select * from pg_attribute_encoding;
 attrelid | attnum |                       attoptions
----------+--------+---------------------------------------------------------
    35569 |      1 | {compresstype=zlib,compresslevel=1,blocksize=32768}
    35569 |      2 | {compresstype=rle_type,compresslevel=1,blocksize=32768}
    35569 |      3 | {compresstype=rle_type,blocksize=32768,compresslevel=1}
(3 rows)

Now back to the current issue. I found that after reorganize, pg_attribute_encoding still retains the column-level options, and pg_class.reloptions still retains the table-level options. It is only the pg_appendonly that's inconsistent with prior-reorganize:

postgres=# alter table t set with (reorganize=true);
ALTER TABLE
postgres=# select * from pg_attribute_encoding;
 attrelid | attnum |                       attoptions
----------+--------+---------------------------------------------------------
    35569 |      1 | {compresstype=zlib,compresslevel=1,blocksize=32768}
    35569 |      2 | {compresstype=rle_type,compresslevel=1,blocksize=32768}
    35569 |      3 | {compresstype=rle_type,blocksize=32768,compresslevel=1}
(2 rows)
postgres=# select relname, reloptions from pg_class where relname = 't';
 relname |       reloptions
---------+-------------------------
 t       | {compresstype=rle_type}
(1 row)

postgres=# select relname, compresstype from pg_appendonly a, pg_class c where a.relid = c.oid;
 relname | compresstype
---------+--------------
 t       |
(1 row)

Then if we add new column, it fails to use the table-level option:

postgres=# alter table t add column d int;
ALTER TABLE
postgres=# select * from pg_attribute_encoding;
 attrelid | attnum |                       attoptions
----------+--------+---------------------------------------------------------
    35569 |      1 | {compresstype=zlib,compresslevel=1,blocksize=32768}
    35569 |      2 | {compresstype=rle_type,compresslevel=1,blocksize=32768}
    35569 |      3 | {compresstype=rle_type,blocksize=32768,compresslevel=1}
    35569 |      4 | {compresstype=none,blocksize=32768,compresslevel=0}
(4 rows)

But if I applied a similar fix as mentioned in #8207 (comment) (not exactly the same because the code changed), new column will be able to utilize the table-level options.

Therefore, it seems that this comment is not correct:

		/* 
		 * Need to remove table level compression settings for the
		 * AOCO case since they're set at the column level.
		 */

The table-level option will be used when column-level option is absent. And we should keep them in pg_appendonly when we alter the table with a rewrite (i.e. suggestion #1 as @soumyadeep2007 mentioned in #8207 (comment)).

@RekGRpth
Copy link
Contributor

RekGRpth commented Aug 1, 2022

6x fixed by #13892
7x fixed by #13847

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Projects
None yet
Development

Successfully merging a pull request may close this issue.

7 participants