
ADBDEV-901 ADB 6.8.1 sync #83

Merged: deart2k merged 113 commits into adb-6.x from 6.8.1-sync, Jun 17, 2020
Conversation


deart2k (Member) commented Jun 15, 2020

No description provided.

David Yozie and others added 30 commits April 29, 2020 10:22
According to plannode.h, "plan_node_id" should be unique across the
entire final plan tree, but the ORCA DXL-to-PlanStatement translator
returned uninitialized zero values for BitmapOr and BitmapAnd nodes.
This behavior differed from the Postgres planner and from all other
node translations in this class; it is now fixed.

(cherry picked from commit 53a0b78)
Previously, some functions returned various fixed strings and others
failed with a cache lookup error.  Per discussion, standardize on
returning NULL.  Although user-exposed "cache lookup failed" error
messages might normally qualify for bug-fix treatment, no back-patch;
the risk of breaking user code which is accustomed to the current
behavior seems too high.

Michael Paquier

Original Postgres commit:
postgres/postgres@976b24f
…… (#10034)

* Update docs around ssl_ciphers default, default behavior, TLS 1.2 recommendations

* Update cipher string per Stanley's feedback
* docs - update for gpstart when standby master is not available

* docs - minor edits.

* Update gpstart.xml

Co-authored-by: David Yozie <dyozie@pivotal.io>
…es. (#10023)

* docs - clarify text for resource group CPU core usage in catalog tables.

* docs - minor edits
This commit enables parallel writes for Foreign Data Wrappers. The
FDW framework currently supports parallel scans, but parallel writes
are missing. FDW parallel writes are analogous to writing to writable
external tables that run on all segments.

One caveat is that in the external table framework, writable tables
support a distribution policy:

    CREATE WRITABLE EXTERNAL TABLE foo (id int)
    LOCATION ('....')
    FORMAT 'CSV'
    DISTRIBUTED BY (id);

In foreign tables, the distribution policy cannot be defined during the
table definition, so we assume random distribution for all foreign
tables.

Parallel writes are enabled when the foreign table's exec_location is
set to FTEXECLOCATION_ALL_SEGMENTS only. For foreign tables that run on
master or any segment, the current policy behavior remains.
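The gating rule above can be sketched as follows. This is a minimal illustration, not GPDB's actual dispatch code: the enum constants mirror the names in the commit message, and `num_parallel_writers` is a hypothetical helper invented for this sketch.

```c
#include <assert.h>

/* Hedged sketch of the gating rule described above: parallel (per-segment)
 * writes are enabled only when a foreign table's exec_location targets all
 * segments; for master/any-segment tables the single-writer behavior is
 * kept. Illustrative only, not GPDB's actual code. */
typedef enum
{
    FTEXECLOCATION_MASTER,
    FTEXECLOCATION_ANY,
    FTEXECLOCATION_ALL_SEGMENTS
} ExecLocation;

/* Returns the number of parallel writer processes to use: one per segment
 * for all-segment tables, otherwise a single writer (old behavior). */
int
num_parallel_writers(ExecLocation loc, int num_segments)
{
    if (loc == FTEXECLOCATION_ALL_SEGMENTS)
        return num_segments;    /* analogous to writable external tables */
    return 1;                   /* master / any-segment: policy unchanged */
}
```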
Previously, gpinitsystem was incorrectly filling the hostname field of each
segment in gp_segment_configuration with the segment's address. This commit
changes it to correctly resolve hostnames and update the catalog accordingly.

This reverts commit 12ef735 ("Revert "gpinitsystem: update catalog with
correct hostname"").
   Commit message from 12ef735:
   The commit requires some additional tweaks to the input file logic for
   backwards compatibility purposes, so we're reverting this until the full
   fix is ready.
Previously, gpinitsystem did not allow the user to specify a hostname
and address for each segment in the input file used with -I; it only
accepted one value per segment and used it for both hostname and
address.

This commit changes the behavior so that the user can specify both
hostname and address. If the user specifies only the address (such as by
using an old config file), it will preserve the old behavior and set
both hostname and address to that value. It also adds a few tests around
input file parsing so SET_VAR is more resilient to further refactors.

The specific changes involved are the following:

1) Change SET_VAR to be able to parse either the old format (address
only) or new format (host and address) of the segment array
representation.
2) Move SET_VAR from gpinitsystem to gp_bash_functions.sh and remove the
redundant copy in gpcreateseg.sh.
3) Remove a hardcoded "~0" in QD_PRIMARY_ARRAY in gpinitsystem,
representing a replication port value, that was left over from 5X.
4) Improve the check for the number of fields in the segment array
representation.

Also, remove use of the ignore-warning flag and use [[ ]] for the IGNORE_WARNING check.
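The dual-format parse in step 1 can be sketched as below. The exact field layout and counts are assumptions for illustration (the real SET_VAR lives in gp_bash_functions.sh and parses a `~`-separated segment array); `parse_segment` and its field order are hypothetical.

```c
#include <assert.h>
#include <stdio.h>
#include <string.h>

/* Hedged sketch of the dual-format parse described above. Assumed layouts
 * (illustrative, not gpinitsystem's actual format):
 *   old: address~port~datadir~dbid~content            (5 fields)
 *   new: hostname~address~port~datadir~dbid~content   (6 fields)
 * Old-format input preserves the old behavior: the single value is used
 * for both hostname and address. */
typedef struct
{
    char hostname[64];
    char address[64];
} SegEntry;

int
parse_segment(const char *line, SegEntry *out)
{
    char    buf[256];
    char   *fields[8];
    int     n = 0;

    snprintf(buf, sizeof(buf), "%s", line);
    for (char *tok = strtok(buf, "~"); tok && n < 8; tok = strtok(NULL, "~"))
        fields[n++] = tok;

    if (n == 6)                 /* new format: hostname and address differ */
    {
        snprintf(out->hostname, sizeof(out->hostname), "%s", fields[0]);
        snprintf(out->address, sizeof(out->address), "%s", fields[1]);
    }
    else if (n == 5)            /* old format: reuse the address for both */
    {
        snprintf(out->hostname, sizeof(out->hostname), "%s", fields[0]);
        snprintf(out->address, sizeof(out->address), "%s", fields[0]);
    }
    else
        return -1;              /* wrong field count (check from step 4) */
    return 0;
}
```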
Commit 13a1f66 forgot to modify the corresponding LockTagTypeNames array;
this commit fixes that.
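The class of bug being fixed — extending an enum without extending its parallel name array — can be caught at compile time. A hedged, self-contained illustration (the `Demo*` enum members and strings here are invented for the sketch, not the real LockTagType values):

```c
#include <assert.h>
#include <string.h>

/* Hedged illustration of keeping an enum and its name array in sync.
 * The _Static_assert fails to compile if a new enum member is added
 * without a matching entry in the names array. */
typedef enum
{
    DEMO_LOCKTAG_RELATION,
    DEMO_LOCKTAG_PAGE,
    DEMO_LOCKTAG_NEWTYPE,           /* newly added member */
    DEMO_LOCKTAG_LAST = DEMO_LOCKTAG_NEWTYPE
} DemoLockTagType;

static const char *const DemoLockTagTypeNames[] =
{
    "relation",
    "page",
    "newtype"                       /* must be added together with the enum */
};

/* compile-time check: one name per enum member */
_Static_assert(sizeof(DemoLockTagTypeNames) / sizeof(DemoLockTagTypeNames[0])
               == DEMO_LOCKTAG_LAST + 1,
               "DemoLockTagTypeNames is out of sync with DemoLockTagType");
```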
The dbms_pipe_session_{A,B} tests are flaky in CI because session B can
call receiveFrom() before session A has even called
createImplicitPipe(). This leads to flaky test failures such as:

--- /tmp/build/e18b2f02/gpdb_src/gpcontrib/orafce/expected/dbms_pipe_session_B.out	2020-04-20 17:02:27.270832458 +0000
+++ /tmp/build/e18b2f02/gpdb_src/gpcontrib/orafce/results/dbms_pipe_session_B.out	2020-04-20 17:02:27.278832994 +0000
@@ -7,14 +7,6 @@

 -- Receives messages sent via an implicit pipe
 SELECT receiveFrom('named_pipe');
-NOTICE:  RECEIVE 11: Message From Session A
-NOTICE:  RECEIVE 12: 01-01-2013
-NOTICE:  RECEIVE 13: Tue Jan 01 09:00:00 2013 PST
-NOTICE:  RECEIVE 23: \201
-NOTICE:  RECEIVE 24: (2,rob)
-NOTICE:  RECEIVE 9: 12345
-NOTICE:  RECEIVE 9: 12345.6789
-NOTICE:  RECEIVE 9: 99999999999
  receivefrom
 -------------

@@ -152,12 +144,13 @@
 ORDER BY 1;
       name      | items | limit | private |      owner
 ----------------+-------+-------+---------+-----------------
+ named_pipe     |     9 |    10 | f       |
  pipe_name_3    |     1 |       | f       |
  private_pipe_1 |     0 |    10 | t       | pipe_test_owner
  private_pipe_2 |     9 |    10 | t       | pipe_test_owner
  public_pipe_3  |     0 |    10 | f       |
  public_pipe_4  |     0 |    10 | f       |
-(5 rows)
+(6 rows)

This commit introduces an explicit sleep at the start of session B to
give session A a better chance to run.

Co-authored-by: Jesse Zhang <jzhang@pivotal.io>
(cherry picked from commit 985c5e2)
This is a backport from master 2a7b2bf.
Target partitions need new ResultRelInfos and override the previous
estate->es_result_relation_info in NextCopyFromExecute(). The
new ResultRelInfo may leave its resultSlot as NULL. If sreh is
on, a parsing error is caught and we loop back to parse
another row; however, estate->es_result_relation_info has
already been changed. This can cause a crash.

Reproduce:

```sql
CREATE TABLE partdisttest(id INT, t TIMESTAMP, d VARCHAR(4))
DISTRIBUTED BY (id)
PARTITION BY RANGE (t)
(
  PARTITION p2020 START ('2020-01-01'::TIMESTAMP) END ('2021-01-01'::TIMESTAMP),
  DEFAULT PARTITION extra
);

COPY partdisttest FROM STDIN LOG ERRORS SEGMENT REJECT LIMIT 2;
1	'2020-04-15'	abcde
1	'2020-04-15'	abc
\.
```

Authored-by: ggbq <taos.alias@outlook.com>
(cherry picked from commit 4eebb0e)
(cherry picked from commit c975679)
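The failure mode above amounts to a dangling executor pointer across the sreh error-recovery loop. A hedged single-file sketch of the save/restore pattern that avoids it — the struct fields echo the names in the commit message, but `copy_one_row` and the trimmed-down types are invented for illustration, not GPDB's actual code:

```c
#include <assert.h>
#include <stddef.h>

/* Hedged sketch: when sreh catches a parse error mid-row, execution loops
 * back for the next row, but estate->es_result_relation_info may still
 * point at the target partition's ResultRelInfo (whose result slot can be
 * NULL). Restoring the saved pointer on every exit path avoids the crash. */
typedef struct ResultRelInfo
{
    void   *resultSlot;         /* may be NULL for a fresh partition */
} ResultRelInfo;

typedef struct EState
{
    ResultRelInfo *es_result_relation_info;
} EState;

void
copy_one_row(EState *estate, ResultRelInfo *target_partition, int parse_failed)
{
    ResultRelInfo *saved = estate->es_result_relation_info;

    estate->es_result_relation_info = target_partition;
    if (parse_failed)
    {
        /* sreh swallows the error; restore before parsing the next row */
        estate->es_result_relation_info = saved;
        return;
    }
    /* ... insert the tuple into the target partition ... */
    estate->es_result_relation_info = saved;
}
```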
With this, the xerces headers are not pulled into the xforms/ files.
Makes each .o file about 100 kB shorter. Shrinks the postgres binary
from about 128 MB to 121 MB, with assertions and debugging enabled.

(cherry picked from commit 35cfc37)
Try to not pull in unnecessary dependencies in header files.

(cherry picked from commit 632ad76)
CMemoryPool.h is included literally everywhere, because it comes with
gpos/base.h. Every little bit helps.

(cherry picked from commit 99a0066)
Let's keep base.h as slim as possible.

(cherry picked from commit b88c819)
Avoid including dxlops.h, which pulls *all* the CParseHandler header
files. Makes the postgres binary (with assertions and debugging
information) about 1.5 MB smaller.

(cherry picked from commit 529ce1a)
(cherry picked from commit 347fba3)
ops.h brings in the headers for *all* the files in include/gpopt/operators/,
which is way more than is needed in most cases.

(cherry picked from commit 143dd82)
They use GPOS_RESET_EX, which needs ITask.

Fix missing includes in unit tests.

(cherry picked from commit 88f9744)
DPE stats are computed when we have a dynamic partition selector that's
applied on another child of a join. The current code continues to use
DPE stats even for the common ancestor join and nodes above it, but
those nodes aren't affected by the partition selector.

Regular Memo groups pick the best expression among several to compute
stats, which makes row count estimates more reliable. We don't have
that luxury with DPE stats, therefore they are often less reliable.

By minimizing the places where we use DPE stats, we should overall get
more reliable row count estimates with DPE stats enabled.

The fix also ignores DPE stats with row counts greater than the group
stats. Partition selectors eliminate certain partitions, therefore
it is impossible for them to increase the row count.
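That last rule reduces to clamping the DPE-derived estimate at the group's own estimate. A minimal sketch, with a hypothetical helper name (the real logic sits in ORCA's statistics derivation):

```c
#include <assert.h>

/* Hedged sketch of the capping rule described above: a partition selector
 * can only eliminate rows, so a DPE-derived row count larger than the
 * group's own estimate is treated as unreliable and clamped. */
double
effective_dpe_rows(double dpe_rows, double group_rows)
{
    return dpe_rows > group_rows ? group_rows : dpe_rows;
}
```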
Looks like we were missing an "extern" in two places. While I was at it,
also tidy up guc_gp.c by moving the definition of Debug_resource_group
into cdbvars.c, and add declaration of
gp_encoding_check_locale_compatibility to cdbvars.h.

This was uncovered by building with GCC 10 and Clang 11, where
-fno-common is the new default [1][2] (vis-à-vis -fcommon). I could also
reproduce this by turning on "-fno-common" in older releases of GCC and
Clang.

We were relying on a myth (or legacy compiler behavior, rather) that C
tentative definitions act _just like_ declarations -- in plain English:
missing an "extern" in a global variable declaration-wannabe wouldn't
harm you, as long as you don't put an initial value after it.

This resolves #10072.

[1] "3.17 Options for Code Generation Conventions: -fcommon"
https://gcc.gnu.org/onlinedocs/gcc-10.1.0/gcc/Code-Gen-Options.html#index-tentative-definitions
[2] "Porting to GCC 10" https://gcc.gnu.org/gcc-10/porting_to.html
[3] "[Driver] Default to -fno-common for all targets" https://reviews.llvm.org/D75056

(cherry picked from commit ee7eb0e)
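The tentative-definition distinction can be shown in a single file; the link-time breakage only appears when two translation units both carry the tentative definition (-fcommon merges them, -fno-common reports a duplicate symbol). The variable and function names below are made up for illustration:

```c
#include <assert.h>

/* Hedged single-file illustration of the pitfall described above. */
int Debug_resource_group_demo;          /* tentative definition: allocates
                                         * storage in this translation unit;
                                         * duplicating it in another .c file
                                         * breaks the link under -fno-common */
extern int Debug_resource_group_demo;   /* true declaration: no storage, safe
                                         * to repeat in any number of files */

int
read_flag(void)
{
    /* tentative definitions are zero-initialized, like a GUC default */
    return Debug_resource_group_demo;
}
```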
gpload in the latest windows client package requires VS redistributable
package. Output more meaningful message if pg.py fails to load.
David Yozie and others added 22 commits June 3, 2020 11:49
Orca uses this property for cardinality estimation of joins.
For example, a join predicate foo join bar on foo.a = upper(bar.b)
will have a cardinality estimate similar to foo join bar on foo.a = bar.b.

Other functions, like foo join bar on foo.a = substring(bar.b, 1, 1)
won't be treated that way, since they are more likely to have a greater
effect on join cardinalities.

Since this is specific to ORCA, we use logic in the translator to determine
whether a function or operator is NDV-preserving. Right now we consider
a very limited set of operators; we may add more at a later time.

Let's assume that we join tables R and S and that f is a function or
expression that refers to a single column and does not preserve
NDVs. Let's also assume that p is a function or expression that also
refers to a single column and that does preserve NDVs:

join predicate       card. estimate                         comment
-------------------  -------------------------------------  -----------------------------
col1 = col2          |R| * |S| / max(NDV(col1), NDV(col2))  build an equi-join histogram
f(col1) = p(col2)    |R| * |S| / NDV(col2)                  use NDV-based estimation
f(col1) = col2       |R| * |S| / NDV(col2)                  use NDV-based estimation
p(col1) = col2       |R| * |S| / max(NDV(col1), NDV(col2))  use NDV-based estimation
p(col1) = p(col2)    |R| * |S| / max(NDV(col1), NDV(col2))  use NDV-based estimation
otherwise            |R| * |S| * 0.4                        this is an unsupported pred
Note that adding casts to these expressions is OK, as is switching the left and right sides.

Here is a list of expressions that we currently treat as NDV-preserving:

coalesce(col, const)
col || const
lower(col)
trim(col)
upper(col)

One more note: We need the NDVs of the inner side of Semi and
Anti-joins for cardinality estimation, so only normal columns and
NDV-preserving functions are allowed in that case.
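The estimation table above can be sketched as a small function. This is illustrative only — `join_estimate` and the `SideKind` classification are invented for the sketch; ORCA's real logic lives in the translator and the cardinality estimator:

```c
#include <assert.h>

/* Hedged sketch of the estimation rules above. Each side of the equality
 * predicate is classified as a bare column, an NDV-preserving expression
 * p(col), or some other function f(col). */
typedef enum
{
    BARE_COLUMN,        /* col */
    NDV_PRESERVING,     /* p(col): lower(), upper(), trim(), ... */
    OTHER_FUNCTION      /* f(col): substring(), ... */
} SideKind;

double
join_estimate(double r_rows, double s_rows,
              double ndv1, double ndv2,
              SideKind left, SideKind right)
{
    double  maxndv = ndv1 > ndv2 ? ndv1 : ndv2;

    if (left == OTHER_FUNCTION && right == OTHER_FUNCTION)
        return r_rows * s_rows * 0.4;   /* unsupported predicate */
    if (left == OTHER_FUNCTION)
        return r_rows * s_rows / ndv2;  /* f(col1) = col2 or p(col2) */
    if (right == OTHER_FUNCTION)
        return r_rows * s_rows / ndv1;  /* mirror case */
    /* col = col, p(col1) = col2, p(col1) = p(col2) */
    return r_rows * s_rows / maxndv;
}
```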

This is a port of these GPDB 5X and GPOrca PRs:
https://github.com/greenplum-db/gporca/pull/585
https://github.com/greenplum-db/gpdb/pull/10090

(cherry picked from commit 3ccd1eb)

Also updated join.sql expected files with minor motion changes.
* docs - new pxf IGNORE_MISSING_PATH option

* reword default case

* add IGNORE_MISSING_PATH info to relevant profiles

* the action to take

* try to describe why pxf behaviour is not optimal
We now use the initplan id to differentiate the tuplestores used by
different INITPLAN functions, so each INITPLAN function writes its
result into its own tuplestore.

Also fix a bug that appended the initplan in the wrong place, which
could produce wrong results in the UNION ALL case.
cherry-pick from: 2589a3
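The disambiguation idea can be sketched as deriving a distinct tuplestore name from the initplan id. The naming scheme and helper below are assumptions for illustration, not the actual identifier format used by GPDB:

```c
#include <assert.h>
#include <stddef.h>
#include <stdio.h>
#include <string.h>

/* Hedged sketch: build a per-initplan tuplestore name so two INITPLAN
 * functions in the same query no longer share one store. Hypothetical
 * naming scheme for illustration only. */
void
initplan_tuplestore_name(char *buf, size_t len, int session_id, int init_plan_id)
{
    snprintf(buf, len, "initplan_func_%d_%d", session_id, init_plan_id);
}
```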
…10238)

* docs - add info about moving a query to a different resource group

* need to be superuser

* remove upgrade/downgrade info for master
When introducing a new mirror, we need two steps:
1. start mirror segment
2. update gp_segment_configuration catalog

Previously, gp_add_segment_mirror was called to update
the catalog, but the dbid was chosen by get_availableDbId(), which
cannot guarantee the same dbid as in internal.auto.conf.
Reported in issue 9837.

Reviewed-by: Paul Guo <pguo@pivotal.io>
Reviewed-by: Bhuvnesh Chaudhary <bhuvnesh2703@gmail.com>

cherry-pick from commit: f7965d and 1ee999
* Update statement about mirroring recommendations & support

* Updates based on k8s feedback
After gprecoverseg, we need to wait until the cluster is synchronized before
running subsequent tests.

Reviewed-by: Hubert Zhang <hzhang@pivotal.io>
Reviewed-by: Ashwin Agrawal <aagrawal@pivotal.io>

Cherry-picked from d490798
…all back (#10265)

We found that when we have window functions and also correlated subqueries in the
same target list, the CQueryMutators::NormalizeWindowProjList method would leave
the varattno attributes of outer references in the subquery unchanged. That needs
to be changed, since we are producing a different RTE for the query.

We will eventually create a fix. For now, this PR just searches for the
problem and triggers a fallback when we see it, to avoid incorrect results.

Co-authored-by: Abhijit Subramanya <asubramanya@pivotal.io>
Co-authored-by: Hans Zeller <hzeller@vmware.com>
@deart2k deart2k merged commit 698437d into adb-6.x Jun 17, 2020
RekGRpth pushed a commit that referenced this pull request Dec 3, 2025
Change bug_report format to YAML
@deart2k deart2k deleted the 6.8.1-sync branch December 11, 2025 13:12