ADBDEV-1905 6.17.1 sync by deart2k · Pull Request #232 · arenadata/gpdb

deart2k · 2021-07-22T14:06:36Z

Here are some reminders before you submit the pull request

Add tests for the change
Document changes
Communicate in the mailing list if needed
Pass make installcheck
Review a PR in return to support the community

Some tests enable fsync temporarily, but disable fsync incorrectly finally since they use "gpconfig -r". "gpconfig -r" comments out the guc setting so finally the system uses the default guc value but for guc fsync the default value is on. Fixing this by explicitly set fsync off with gpconfig in those tests. This change should be able to speed up testing a lot since all subsequent tests could be accelerated. Reviewed by: Ashwin Agrawal <aashwin@vmware.com> (cherry picked from commit b43ffa0)

NOT IN is equivalent to "<> ALL" SQL expression and should be replaced with LAS-Apply-NotIn in CSubqueryHandler::FRemoveAllSubquery(). Earlier, ORCA transformed "<> ALL" subquery to a LAS-Apply-NotIn with the apply predicate as a filter over it's inner child (See CXformSelect2Apply). This is problematic when the subquery contains NULLs, as the generated filter is NULL rejecting. Thus, although LAS-Apply-NotIn correctly handles the NULL-semantics of a NOT IN subquery, an incorrect NULL rejecting filter on inner side will prevent the Apply from seeing any NULLs, resulting in wrong results. For example: A LOJ inside the NOT IN subquery (see issue #10967) create table t1(a int) distributed by (a); insert into t1 values (1), (2), (3); set optimizer_join_order='query'; select * from t1 where a not in ( select b.a from t1 a left join t1 b on false); ORCA transforms the following SubqueryAll operator: +--CScalarSubqueryAll(<>)["a" (16)] |--CLogicalLeftOuterJoin | |--CLogicalGet "t1" ("t1") | |--CLogicalConstTableGet | +--CScalarConst (1) +--CScalarIdent "a" (0) to a LAS-Apply-NotIn, with a filter over it's inner child: +--CLogicalLeftAntiSemiApplyNotIn (Reqd Inner Cols: "a" (16)) |--CLogicalGet "t1" ("t1") |--CLogicalSelect | |--CLogicalLeftOuterJoin | | |--CLogicalGet "t1" ("t1"), | | |--CLogicalConstTableGet | | +--CScalarConst (1) | +--CScalarCmp (=) | |--CScalarIdent "a" (0) | +--CScalarIdent "a" (16) +--CScalarConst (1) The filter is NULL rejecting, and thus the inner Left Outer Join is normalized to an Inner Join: +--CLogicalLeftAntiSemiApplyNotIn (Reqd Inner Cols: "b" (16)) |--CLogicalGet "t1" ("t1") |--CLogicalInnerJoin | |--CLogicalGet "t2" ("t2") | |--CLogicalConstTableGet | +--CScalarCmp (=) | |--CScalarIdent "a" (0) | +--CScalarIdent "b" (16) +--CScalarConst (1) The Apply operator sees an empty set on the inner side, thus produce the incorrect output of : (1), (2), (3). This commit fixes issue #10967, by placing the predicate in the LAS-Apply-NotIn, instead of generating a NULL rejecting filter: +--CLogicalLeftAntiSemiApplyNotIn (Reqd Inner Cols: "b" (16)) |--CLogicalGet "t1" ("t1") |--CLogicalLeftOuterJoin | |--CLogicalGet "t2" ("t2") | |--CLogicalConstTableGet | +--CScalarConst (1) +--CScalarCmp (=) |--CScalarIdent "a" (0) +--CScalarIdent "b" (16)

This commit introduces a pg_upgrade check to look for views referencing deprecated tables in a 5X cluster. This is accompanied by 5X PR: #12072, which introduces a pg_upgrade_support UDF to detect such views.

Currently, when recovering segments gprecoverseg updates the pg_hba.conf entries on every acting primary regardless of whether those primaries have mirrors to be recovered. This commit restricts the update to only those primaries that need to be updated. Co-authored-by: Jamie McAtamney <jmcatamney@vmware.com> Co-authored-by: Nikhil Kak <nkak@vmware.com>

This fixes the upgrade path for 6X->6X upgrades which was broken by 67df554 with the following error during a 6X->6X upgrade: Checking for views referencing deprecated tables SQL command failed CREATE OR REPLACE FUNCTION view_references_deprecated_tables(OID) RETURNS BOOL AS '$libdir/pg_upgrade_support' LANGUAGE C STRICT; ERROR: could not find function "view_references_deprecated_tables" in file "/usr/local/greenplum-db-devel/lib/postgresql/pg_upgrade_support.so" For upgrades from 6X->6X, we should not have to run this check as deprecated object detection only applies to major version upgrades.

(cherry picked from commit d0186e8)

The SQL 2003 standard specified 4 different levels of "SQL data access indication": 1. NO SQL 2. CONTAINS SQL 3. READS SQL DATA 4. MODIFIES SQL DATA Upstream PostgreSQL doesn't support any of the above. Greenplum has had syntactic support for this as part of `CREATE FUNCTION` command, but so far we only store that in the catalog (and possibly dump back out), without taking it into consideration in our planning or execution process. Except ORCA does, for a seemingly unnecessary purpose: ORCA opts out of various optimizations when it sees a function that is not "NO SQL" (that's a mouthful). The intent, it seems, is we really want to execute such functions on the QD. But that seems questionable for two fundamental reasons: 1. Forgoing the various optimizations (as evidenced by the test cases in this patch) does NOT ensure we execute these functions on the QD node. 1. "executing on QD" is only an approximate intent, the real desired outcome is that we want to enable a *dispatch* from within functions that contain SQL. But we don't have any control over that, either: often an active dispatch is already in-flight and the plan is effectively non-executable. 1. There is a small niche that *does* match the "execute on QD" intent, namely, querying a "master-only" table (such tables also exist on the segments, but they contain no data) in a SQL function. That's where the `EXECUTE ON COORDINATOR` syntax is the best signal. TL;DR: we are effectively pessimizing for any function that possibly contains SQL, without ensuring that we can achieve the intended goal (dispatch nested queries from QD). We should rethink whether the SQL data access indication is even the right signal for us to achieve such goals (I don't think so). This patch still keeps the data access attribute in metadata (DXL), but removes it from the rest of ORCA, including conditions that used to consider them alongside volatile. As a result, we can expect various more optimal plans in different situations. See the patch for details. One thing that this commit differs from 384e5f8 is that we need to revert to using C++98 syntaxes (default initializers are out). Co-authored-by: Ekta Khanna <ekhanna@pivotal.io> (cherry picked from commit 384e5f8)

* Added extra step * corrected syntax

* Split steps into two processes * changes

* Putting recovermatrix.png in correct subdir. * Additional small fixes

…ld is External Scan (#11970) (#12080) - For the child of UNION ALL, based on the distribution spec of the outer child, the distribution spec is requested on the inner child. In this specific case, the outer child had EdtExternal distribution spec, and it requested Any Distribution spec from the inner child. - UNION ALL derives Replicated Distribution Spec if any of the child has Replicated Distribution Spec. So, for the query: explain select * from ext_tbl union all select * from replicated_tbl where ext_tbl derives External Distribution Spec, and replicated_tbl has Replicated distribution spec. UNION ALL derived Replicated Distribution Spec, and it was requested Singleton Distribution Spec, so a Gather Motion from only 1 segment was added on top of Union ALL, due to which data was missing from the other segments for the external table. The resulting bad plan is below: QUERY PLAN ------------------------------------------------------------------------------------ Gather Motion 1:1 (slice1; segments: 1) (cost=0.00..914.99 rows=1000001 width=4) -> Append (cost=0.00..879.07 rows=2000002 width=4) -> Foreign Scan on ext_tbl (cost=0.00..440.35 rows=1000000 width=4) -> Seq Scan on replicated_tbl (cost=0.00..431.00 rows=2 width=4) Ideally, it should have pulled data from all the segments for the external table, and for the replicated table it should have filtered out data from all segments except one. In order to achive the same, this commit replaces External Distribution Spec with Random Distribution Spec because External Distribution spec implies Random Distribution. Now, UNION ALL will request Non Singleton Distribution spec from it's inner child and place a filter on top of the scan over the replicated table. The Derived Spec of UNION ALL will be NULL, and a Gather Motion from 3 segments will be placed on top of it resulting in a plan like below: QUERY PLAN ------------------------------------------------------------------------------------ Gather Motion 2:1 (slice1; segments: 2) (cost=0.00..895.03 rows=1000001 width=4) -> Append (cost=0.00..877.07 rows=500001 width=4) -> Foreign Scan on ext_tbl (cost=0.00..440.35 rows=500000 width=4) -> Result (cost=0.00..431.00 rows=1 width=4) One-Time Filter: (gp_execution_segment() = 0) -> Seq Scan on replicated_tbl (cost=0.00..431.00 rows=1 width=4) Co-authored-by: Bhuvnesh Chaudhary <bchaudhary@vmware.com> Co-authored-by: David Kimura <dkimura@vmware.com> (cherry picked from commit 794cd4e)

CreateReplicationSlot() and DropReplicationSlot() were not cleaning up the query buffer in some cases (mostly error conditions) which meant a small leak. Not generally an issue as the error case would result in an immediate exit, but not difficult to fix either and reduces the number of false positives from code analyzers. In passing, also add appropriate PQclear() calls to RunIdentifySystem(). Pointed out by Coverity. (cherry picked from commit 273b29d)

AIX has never been supported for GP6. Remove code supporting AIX in the enterprise build system. Authored-by: Brent Doil <bdoil@vmware.com>

* Fix typos * Correct command option * Update link * Correct example insert statement * Add note: run commands as root user * Redact info Co-authored-by: David Yozie <dyozie@pivotal.io>

* Adding missing option and its explanation * Changing order of options, per reviewer feedback.

…ides (#11903) * First pass at synching up backup/restore content across both guides * Adding new command to menu

gpmovemirrors and gprecoverseg do no spawn the correct number of parallel processes as intended by -B option. This change fixes that behavior We would be using two options for finer contol. '-B' option limits the number of hosts working in parallel for recovery '-b' option limits the number of segments working in parallel on a host Co-authored-by: David Krieger <dkrieger@vmware.com> (cherry picked from commit 695a73a)

Adding test steps and tests to check if the -B and -b options passed in to gpmovemirrors and gprecoverseg are used as intended (cherry picked from commit 25e463d)

gprecoverseg -r didn't pick up the -B/-b options for spawning correct number of parallel processes. This change fixes that behaviour. Co-authored-by: Orhan Kislal <okislal@vmware.com> (cherry picked from commit 00d691d)

gpaddmirrors currently does not spawn correct number of parallel processes as intended by -B option. This change fixes that behavior. We would be using two options -B/-b for finer contol. (cherry picked from commit 9c399a7)

* Change GPText to Tanzu Greenplum Text in GPDB content. * Reviewer feedback

* added x-log * Added refernce to Xlog

If one (or more) of the hosts are unreachable, gprecoverseg -p showed an error (ssh: unreachable host) even though the operation continued and finalized successfully. This commit adds checks to eliminate these redundant ssh calls. Co-authored-by: Orhan Kislal <okislal@vmware.com>

At the moment, an error will always generate a debug_query_string exception message in the logs even when log_min_messages is set to PANIC. This can be a security concern if the query string is not meant to be logged (e.g. a failed pgcrypto decrypt query is logged which can be equivalent to logging a plaintext password). Fix the issue by changing the call function from write_stderr to elog. With elog, the log message will be controlled by log_min_messages and may enhance the original debugging intent (the debug_query_string log will always show the user-executed query along with the actual query that failed). GPDB references: https://github.com/greenplum-db/gpdb/pull/1456 https://github.com/greenplum-db/gpdb/commit/bd7f682392b30b859588d0d60f2625095f1cd63f Backport of https://github.com/greenplum-db/gpdb/commit/e1c553bb9e4772a4124f7c2a41c552998f14cf88.

This is a regression caused by commit: 6ae9179. The refactor forget to consider agg on entryDB. So fix the issue and add a test case to cover it.

some problem may caused by standard_conforming_strings, we check the standard_conforming_strings in gpdb and give some advice for it

grps_tlist for MppGroupContext may contains extra vars with the group expression in some cases, for example subplan in group expression. But for normal case it only contains group expression in grps_tlist. We used to sure that there could only be one target entry in grps_tlist for a group. When we build the 3 stage agg and try to split the Aggref and find or add target list into different Aggref stages with function `split_aggref`, it'll match to wrong Vars in `AGGSTAGE_INTERMEDIATE` and `AGGSTAGE_FINAL` stage. Reviewed-by: Zhenghua Lyu <zlyu@vmware.com> Reviewed-by: Wen Lin <linw@vmware.com>

Previous commit d87d5e6 use GUC optimizer_enable_groupagg to force orca generate same plan with planner for a test, but seems it can not guarantee this right now. So disable orca for this case.

Stolb27 · 2021-07-23T04:36:07Z

GPORCA conflicts are caused by the mismatch of our commit and upstream backport. Upstream backport contains only part of our changes. Moreover, we can't create a minidump for AllSubqueryWithSubqueryInScalar with the upstream version of the patch, because ORCA can't produce a valid plan:

test=#      EXPLAIN SELECT * FROM foo WHERE (SELECT a FROM foo limit 1) = ALL(SELECT b FROM bar);
LOG:  statement: EXPLAIN SELECT * FROM foo WHERE (SELECT a FROM foo limit 1) = ALL(SELECT b FROM bar);
LOG:  2021-07-23 03:19:50:933783 UTC,THD000,TRACE,"[OPT]: Number of plan alternatives: 12
",
2021-07-23 03:19:50:942197 UTC,THD000,ERROR,"DXL-to-PlStmt Translation: Attribute number 22661984 not found in project list",
LOG:  Pivotal Optimizer (GPORCA) failed to produce plan (unexpected)

We don't which query creates AllSubqueryWithSubqueryInScalar.mdp but it passes the test successfully.
After discussion with @darthunix, we decided to revert the upstream patch and open an issue about the problem.

Upstream implementation break down minidump creation for AllSubqueryWithSubqueryInScalar.mdp (see #232) This reverts commit af7c79b.

we remove such functionality previously 89329d8 but it was added again in 1e375f0

Stolb27 · 2021-07-27T07:31:39Z

69f0824 implements new gpload test suite that requires pytest in the environment.

1e375f0 removes gcc sourcing from test pipelines Upstream are going to use docker environment variables to inject PATH and LD_LIBRARY_PATH But this variables is applied only for root, but not for gpadmin user that run test too. But gpadmin uses gcc to build debug extensions during test run. As a result, gpadmin uses system gcc with pg_config flags from GCC 7

69f0824 introduced pytest tests for gpload its backport from master and currently it doestn't work correctly and is disabled for different distributions (centos6, redhat, ubuntu). I've decided to disable this tests and open ADBDEV-1915 to resolve this problem later.

Stolb27 · 2021-07-28T23:33:18Z

@maksm90, @darthunix, @leskin-in this PR is ready for review. Please, take a look.

darthunix

We have a series of potentially dangerous commits about fsync on startup (I wish we would not be in trouble one day because if them). It was also funny to see that we still have logical errors in resource queues.

Thanks @Stolb27 for his bug fixes in a current release.

Stolb27 · 2021-08-03T01:57:13Z

894d868 removes ORCA's usages for prodataaccess attributes
c99892a by @InnerLife0. We need to rebase and correct Check if subpath can do motion. #219 after merging this one

In python 3.12+, adding a backslash before a non-special character results in SyntaxWarning. This escaping is often used in regex strings. This patch fixes string literals that cause such a warning. Regex strings are now raw strings. In the other strings, escaping is fixed.

paul-guo- and others added 30 commits June 3, 2021 15:34

pg_upgrade check: views with deprecated tables

67df554

This commit introduces a pg_upgrade check to look for views referencing deprecated tables in a 5X cluster. This is accompanied by 5X PR: #12072, which introduces a pg_upgrade_support UDF to detect such views.

Backfill 4 MDP tests for FuncDataAccess

4584895

(cherry picked from commit d0186e8)

Added reload command (#12006)

7cb7203

Docs: Pgcrypto - Added extra step to create extension (#11995)

77b3093

* Added extra step * corrected syntax

Docs: edited provided gplogfilter example (#11996)

57eedf8

Docs: Split Restoring Master Mirroring steps into two processes (#11925)

ee98db5

* Split steps into two processes * changes

Docs: corrected statement about gplogfilter (#12023)

f669bcf

Docs: incorrect guc gp_external_max_segments (#12035)

526cd2f

DOCS: Fixes per docs feedback #5 (#11914)

24e1868

* Putting recovermatrix.png in correct subdir. * Additional small fixes

Various fixes in response to docs feedback (#11947)

20f9c93

Remove AIX build support from gpAux.

e226485

AIX has never been supported for GP6. Remove code supporting AIX in the enterprise build system. Authored-by: Brent Doil <bdoil@vmware.com>

Assorted documentation fixes (#11990)

480baab

* Fix typos * Correct command option * Update link * Correct example insert statement * Add note: run commands as root user * Redact info Co-authored-by: David Yozie <dyozie@pivotal.io>

DOCS: Adding missing "i" option and its explanation to table (#11976)

6939f09

* Adding missing option and its explanation * Changing order of options, per reviewer feedback.

DOCS: First pass at synching up backup/restore content across both gu…

c1405cd

…ides (#11903) * First pass at synching up backup/restore content across both guides * Adding new command to menu

Add behave tests: movemirrors & recoverseg -B/-b

d7c854f

Adding test steps and tests to check if the -B and -b options passed in to gpmovemirrors and gprecoverseg are used as intended (cherry picked from commit 25e463d)

recoverseg rebalance fix -B option

54261fb

gprecoverseg -r didn't pick up the -B/-b options for spawning correct number of parallel processes. This change fixes that behaviour. Co-authored-by: Orhan Kislal <okislal@vmware.com> (cherry picked from commit 00d691d)

gpaddmirrors: fix parallel processes -B/-b option

489b89c

gpaddmirrors currently does not spawn correct number of parallel processes as intended by -B option. This change fixes that behavior. We would be using two options -B/-b for finer contol. (cherry picked from commit 9c399a7)

uniform default/max options across utils

0f30f1d

DOCS: Change GPText to Tanzu Greenplum Text in GPDB content. (#12108)

c88b58f

* Change GPText to Tanzu Greenplum Text in GPDB content. * Reviewer feedback

Docs: Xlog (#12034)

1b10b69

* added x-log * Added refernce to Xlog

(Jerome)Junfeng Yang and others added 5 commits July 20, 2021 09:59

Agg on entryDB shouldn't raise error

d87d5e6

This is a regression caused by commit: 6ae9179. The refactor forget to consider agg on entryDB. So fix the issue and add a test case to cover it.

complete gpload error message when cannot create external table (#12337)

8990c0a

some problem may caused by standard_conforming_strings, we check the standard_conforming_strings in gpdb and give some advice for it

Stabilize the test resutls for gp_aggregates

3633e36

Previous commit d87d5e6 use GUC optimizer_enable_groupagg to force orca generate same plan with planner for a test, but seems it can not guarantee this right now. So disable orca for this case.

ADBDEV-1905 ADB 6.17.1 sync

4968f73

deart2k requested review from Stolb27, darthunix, leskin-in and maksm90 July 22, 2021 14:06

deart2k changed the title ~~6.17.1 sync~~ ADBDEV-1905 6.17.1 sync Jul 22, 2021

Stolb27 added 3 commits July 23, 2021 05:05

Revert "Fix wrong results from nested LOJ in ALL subqueries"

d178938

Upstream implementation break down minidump creation for AllSubqueryWithSubqueryInScalar.mdp (see #232) This reverts commit af7c79b.

Merge remote-tracking branch 'arena/adb-6.x' into 6.17.1-sync

6aafaf5

remove python vendoring

b631606

we remove such functionality previously 89329d8 but it was added again in 1e375f0

Stolb27 force-pushed the 6.17.1-sync branch 3 times, most recently from 2cad289 to d231860 Compare July 28, 2021 03:12

Stolb27 force-pushed the 6.17.1-sync branch from d231860 to 01293d9 Compare July 28, 2021 05:55

Merge branch 'adb-6.x' into 6.17.1-sync

00de312

darthunix approved these changes Jul 29, 2021

View reviewed changes

Stolb27 approved these changes Aug 3, 2021

View reviewed changes

deart2k merged commit f215ca3 into adb-6.x Aug 4, 2021

deart2k deleted the 6.17.1-sync branch August 4, 2021 06:09

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

ADBDEV-1905 6.17.1 sync#232

ADBDEV-1905 6.17.1 sync#232
deart2k merged 152 commits intoadb-6.xfrom
6.17.1-sync

deart2k commented Jul 22, 2021

Uh oh!

Stolb27 commented Jul 23, 2021

Uh oh!

Stolb27 commented Jul 27, 2021

Uh oh!

Stolb27 commented Jul 28, 2021

Uh oh!

darthunix left a comment

Uh oh!

Stolb27 commented Aug 3, 2021

Uh oh!

Reviewers

Assignees

Labels

Projects

Milestone

Development

Uh oh!

20 participants

Conversation

deart2k commented Jul 22, 2021

Here are some reminders before you submit the pull request

Uh oh!

Stolb27 commented Jul 23, 2021

Uh oh!

Stolb27 commented Jul 27, 2021

Uh oh!

Stolb27 commented Jul 28, 2021

Uh oh!

darthunix left a comment

Choose a reason for hiding this comment

Uh oh!

Stolb27 commented Aug 3, 2021

Uh oh!

Reviewers

Assignees

Labels

Projects

Milestone

Development

Uh oh!

20 participants