[Ideas] Description of backward-incompatible changes between Greenplum 6 and Cloudberry in the documentation. #1816
Replies: 7 comments 2 replies
-
|
Here are the backward-incompatible changes blocking the upgrade that we were able to identify. Upgrade blockersIncompatible change in the internal format of the
|
Beta Was this translation helpful? Give feedback.
-
|
The following backward-incompatible changes were identified in the behavior of functions and operators. Changes in function and operator behaviorChange in the behavior of set-returning functionsStarting with PostgreSQL kernel version 10, the behavior of Set Returning Functions, such as The difference can be seen in the following example. Greenplum 6: SELECT 'example' AS value, generate_series(1, 3), generate_series(1, 2);
value | generate_series | generate_series
---------+-----------------+-----------------
example | 1 | 1
example | 2 | 2
example | 3 | 1
example | 1 | 2
example | 2 | 1
example | 3 | 2
(6 rows)Cloudberry: SELECT 'example' AS value, generate_series(1, 3), generate_series(1, 2);
value | generate_series | generate_series
---------+-----------------+-----------------
example | 1 | 1
example | 2 | 2
example | 3 |
(3 rows)How to detect Check the application codebase for uses of Set Returning Functions in the How to fix Replace calls to Set Returning Functions in the The return type of the
|
Beta Was this translation helpful? Give feedback.
-
|
We also identified backward-incompatible changes in syntactic constructs. Syntax changes and removed constructsExplicit
|
Beta Was this translation helpful? Give feedback.
-
|
A large block of backward-incompatible changes related to partitioning. Changes in partitioning mechanismsRequirement to include all partition keys in unique indexesUnlike Greenplum 6, in Cloudberry a unique index must include all partition keys. Example of a query that works in Greenplum 6: CREATE TABLE sales (
id int NOT NULL,
region text,
amount decimal(10, 2)
)
DISTRIBUTED BY (id)
PARTITION BY LIST (region)
(
VALUES ('Asia'),
VALUES ('Europe')
);
INSERT INTO sales (id, region, amount)
VALUES
(1, 'Asia', 100.00),
(1, 'Europe', 150.00),
(2, 'Asia', 200.00);
CREATE UNIQUE INDEX sales_id_idx ON sales (id);The same query does not work in Cloudberry: CREATE TABLE sales (
id int NOT NULL,
region text,
amount decimal(10, 2)
)
DISTRIBUTED BY (id)
PARTITION BY LIST (region)
(
VALUES ('Asia'),
VALUES ('Europe')
);
INSERT INTO sales (id, region, amount)
VALUES
(1, 'Asia', 100.00),
(1, 'Europe', 150.00),
(2, 'Asia', 200.00);
CREATE UNIQUE INDEX sales_id_idx ON sales (id);
ERROR: unique constraint on partitioned table must include all partitioning columns
DETAIL: UNIQUE constraint on table "sales" lacks column "region" which is part of the partition key.How to detect All unique indexes that do not include all partition keys can be found with the following query: WITH partitions AS (
SELECT DISTINCT n.nspname, c.relname, c.oid, p.paratts
FROM pg_partition p
JOIN pg_class c ON p.parrelid = c.oid
JOIN pg_namespace n ON n.oid = c.relnamespace
)
SELECT p.nspname, p.relname, ir.relname AS indexname
FROM pg_index i
JOIN partitions p ON p.oid = i.indrelid
JOIN pg_class ir ON ir.oid = i.indexrelid
WHERE (i.indisunique OR i.indisprimary)
AND NOT (p.paratts <@ i.indkey);How to fix Recreate the found indexes, adding the partition keys. Limitation on list partitioning by multiple columnsIn Cloudberry, list partitioning by multiple columns is implemented through subpartitioning: the standard Cloudberry syntax assumes one column or expression at each level. This limitation is related to the fact that list-based partitioning by multiple columns, for example The following example works in Greenplum 6: CREATE TABLE example (a int, b int, c int)
PARTITION BY LIST (b, c)
(
DEFAULT PARTITION other
);But it does not work in Cloudberry: CREATE TABLE example (a int, b int, c int)
PARTITION BY LIST (b, c)
(
DEFAULT PARTITION other
);
ERROR: cannot use "list" partition strategy with more than one columnHow to detect Tables with list-based partitioning by multiple columns can be found with the following query: SELECT n.nspname, c.relname
FROM pg_partition p
JOIN pg_class c ON p.parrelid = c.oid
JOIN pg_namespace n ON n.oid = c.relnamespace
WHERE parkind = 'l'
AND parnatts > 1;How to fix In Cloudberry, subpartitioning should be used for this: CREATE TABLE example (a int, b int, c int)
PARTITION BY LIST (b)
SUBPARTITION BY LIST (c)
SUBPARTITION TEMPLATE (
DEFAULT SUBPARTITION other_c
)
(
DEFAULT PARTITION other
);Support for
|
Beta Was this translation helpful? Give feedback.
-
|
Backward-incompatible changes in the system catalog and metadata were also identified. Changes in the system catalog and metadataRemoved and renamed columns in system catalog views and tablesWhen moving to Cloudberry, together with the PostgreSQL kernel and Greenplum 6 upgrade, the following columns in system catalog views and tables were removed or renamed:
How to detect Check the application codebase for uses of removed or renamed columns. How to fix Make the corresponding changes in the application codebase. Removal of system tables and viewsThe following system tables and views have been completely removed in Cloudberry compared to Greenplum 6:
How to detect Check the application codebase for uses of the removed system tables and views. How to fix Make the corresponding changes in the application codebase. To work with partitioned tables in Cloudberry, use the Removal and renaming of system functionsThe following system functions have been completely removed in Cloudberry compared to Greenplum 6:
Functions in the How to detect Check the application codebase for uses of removed and renamed system functions. How to fix Make the corresponding changes in the application codebase. Changes in how sequence metadata is displayedStarting with PostgreSQL kernel version 10, the way sequence metadata is displayed has changed. Example for Greenplum 6: CREATE SEQUENCE example_seq
START WITH 100
INCREMENT BY 1;
SELECT * FROM example_seq;
sequence_name | last_value | start_value | increment_by | max_value | min_value | cache_value | log_cnt | is_cycled | is_called
------------------+------------+-------------+--------------+---------------------+-----------+-------------+---------+-----------+-----------
example_seq | 100 | 100 | 1 | 9223372036854775807 | 1 | 1 | 0 | f | fExample for Cloudberry: CREATE SEQUENCE example_seq
START WITH 100
INCREMENT BY 1;
SELECT * FROM example_seq;
last_value | log_cnt | is_called
------------+---------+-----------
100 | 0 | f
SELECT * FROM pg_sequence WHERE seqrelid = 'example_seq'::regclass;
seqrelid | seqtypid | seqstart | seqincrement | seqmax | seqmin | seqcache | seqcycle
----------+----------+----------+--------------+---------------------+--------+----------+----------
17186 | 20 | 100 | 1 | 9223372036854775807 | 1 | 20 | f
SELECT * FROM pg_sequences WHERE sequencename = 'example_seq';
schemaname | sequencename | sequenceowner | data_type | start_value | min_value | max_value | increment_by | cycle | cache_size | last_value
------------+-----------------+---------------+-----------+-------------+-----------+---------------------+--------------+-------+------------+------------
public | example_seq | user1 | bigint | 100 | 1 | 9223372036854775807 | 1 | f | 20 |How to detect Check the application codebase for queries of the form How to fix Make the corresponding changes in the application codebase by querying Change in the internal representation of external tablesInside Cloudberry, external tables are represented as foreign tables. Compared to Greenplum 6, this results in the following differences:
How to detect Check the application codebase and scripts for references to external table metadata. The list of external tables can be obtained with the following query: SELECT n.nspname, c.relname, c.relkind, e.urilocation, e.execlocation, e.fmttype, e.fmtopts
FROM pg_exttable e
JOIN pg_class c ON c.oid = e.reloid
JOIN pg_namespace n ON n.oid = c.relnamespace;How to fix Make changes to the application codebase and scripts according to these changes. The
|
Beta Was this translation helpful? Give feedback.
-
|
Regarding configuration and resource management, we were able to identify the following backward-incompatible changes. Changes in configuration and resource managementRemoval of configuration parametersThe following configuration parameters have been completely removed in Cloudberry compared to Greenplum 6:
How to detect Check the application codebase for uses of the listed configuration parameters. How to fix Make the corresponding changes in the application codebase by removing the use of the deleted configuration parameters. Change in possible values of the
|
| Parameter | Greenplum 6 | Cloudberry |
|---|---|---|
wal_level |
archive |
replica |
autovacuum_vacuum_cost_delay |
20ms |
2ms |
checkpoint_completion_target |
0.5 | 0.9 |
extra_float_digits |
0 | 1 |
gp_interconnect_address_type |
wildcard |
unicast |
autovacuum |
off |
on for template0 |
optimizer_enable_orderedagg |
off |
on |
vacuum_cost_page_miss |
10 | 2 |
How to detect
Most of these parameters are system-level parameters and affect operation only indirectly. extra_float_digits stands out because it can affect query results.
How to fix
In places where the effect of the new extra_float_digits value is observed, explicitly set extra_float_digits = 0 at the session or transaction level.
Possible changes in query execution plans
Cloudberry contains many changes and improvements in the GPORCA optimizer that may change the query execution plan. Queries will not “break” logically, but they may start executing differently.
Things to consider:
-
New parameters affecting query plan construction have appeared:
enable_partitionwise_aggregate;enable_partitionwise_join;enable_parallel;enable_parallel_dedup_semi_join;enable_parallel_hash;enable_parallel_append;enable_parallel_dedup_semi_reverse_join;enable_parallel_semi_join;jit;jit_above_cost;jit_dump_bitcode;jit_expressions;jit_inline_above_cost;jit_optimize_above_cost;jit_tuple_deforming.
-
GPORCA does not support multi-level partitioned tables.
How to detect
Capture baseline plans for critical queries before migrating to Cloudberry:
EXPLAIN (ANALYZE, VERBOSE, COSTS, BUFFERS) <query>;How to fix
Depending on the nature of the plan changes and their criticality, rewrite the problematic query or adjust the parameters.
Extensions merged into the core
In Cloudberry, the following extensions have been removed, and their functionality has been moved into the core:
gp_parallel_retrieve_cursor;gp_array_agg;gp_percentile_agg.
How to detect
Check whether these extensions are installed and whether they are used in your codebase:
SELECT extname, extversion
FROM pg_extension
WHERE extname IN ('gp_parallel_retrieve_cursor', 'gp_array_agg', 'gp_percentile_agg');How to fix
If necessary, adapt the application codebase and scripts to use the built-in functionality.
Beta Was this translation helpful? Give feedback.
-
|
Most of the changes were identified by analyzing the Greenplum 6, Greenplum 7, and Cloudberry documentation, after which each change was tested. It is likely that not all backward-incompatible changes have been identified. My proposal is to create a page or section in the documentation where we can describe such changes and update the documentation as new ones are discovered. I am ready to submit a PR to the documentation if there is conceptual agreement on adding such a section. It would be great if you could suggest the right place in the documentation where such a section should be created. |
Beta Was this translation helpful? Give feedback.
Uh oh!
There was an error while loading. Please reload this page.
-
Description
I propose adding a section to the Cloudberry documentation that describes backward-incompatible changes between Greenplum 6 and Cloudberry, regardless of their nature.
Use case/motivation
Our team is working on migrating from Greenplum 6 to Cloudberry, and during our research we have identified some backward-incompatible changes. Describing these backward-incompatible changes will help users prepare for migration and perform it in a more predictable way.
I believe this scenario may also be relevant for other potential Cloudberry users and may help attract users who are planning to migrate from Greenplum 6.
Related issues
No response
Are you willing to submit a PR?
Beta Was this translation helpful? Give feedback.
All reactions