
ADBDEV-901 ADB 6.8.1 sync #83

Merged: deart2k merged 113 commits into adb-6.x from 6.8.1-sync, Jun 17, 2020
Conversation


deart2k (Member) commented Jun 15, 2020

No description provided.

David Yozie and others added 30 commits April 29, 2020 10:22
According to plannode.h, "plan_node_id" should be unique across the
entire final plan tree, but the ORCA DXL-to-PlanStatement translator
returned uninitialized zero values for BitmapOr and BitmapAnd nodes.
This behavior differed from the Postgres planner and from all other
node translations in this class; it is now fixed.

(cherry picked from commit 53a0b78)
Previously, some functions returned various fixed strings and others
failed with a cache lookup error.  Per discussion, standardize on
returning NULL.  Although user-exposed "cache lookup failed" error
messages might normally qualify for bug-fix treatment, no back-patch;
the risk of breaking user code which is accustomed to the current
behavior seems too high.

Michael Paquier

Original Postgres commit:
postgres/postgres@976b24f
…… (#10034)

* Update docs around ssl_ciphers default, default behavior, TLS 1.2 recommendations

* Update cipher string per Stanley's feedback
* docs - update for gpstart when standby master is not available

* docs - minor edits.

* Update gpstart.xml

Co-authored-by: David Yozie <dyozie@pivotal.io>
…es. (#10023)

* docs - clarify text for resource group CPU core usage in catalog tables.

* docs - minor edits
This commit enables parallel writes for Foreign Data Wrappers. The
FDW framework currently supports parallel scans, but parallel writes
are missing. FDW parallel writes are analogous to writing to writable
external tables that run on all segments.

One caveat is that in the external table framework, writable tables
support a distribution policy:

    CREATE WRITABLE EXTERNAL TABLE foo (id int)
    LOCATION ('....')
    FORMAT 'CSV'
    DISTRIBUTED BY (id);

In foreign tables, the distribution policy cannot be defined during the
table definition, so we assume random distribution for all foreign
tables.

Parallel writes are enabled when the foreign table's exec_location is
set to FTEXECLOCATION_ALL_SEGMENTS only. For foreign tables that run on
master or any segment, the current policy behavior remains.
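The gating rule above can be sketched as follows. This is a minimal illustration, not GPDB's actual dispatch code: the enum constants mirror the names in the commit message, and `num_parallel_writers` is a hypothetical helper invented for this sketch.

```c
#include <assert.h>

/* Hedged sketch of the gating rule described above: parallel (per-segment)
 * writes are enabled only when a foreign table's exec_location targets all
 * segments; for master/any-segment tables the single-writer behavior is
 * kept. Illustrative only, not GPDB's actual code. */
typedef enum
{
    FTEXECLOCATION_MASTER,
    FTEXECLOCATION_ANY,
    FTEXECLOCATION_ALL_SEGMENTS
} ExecLocation;

/* Returns the number of parallel writer processes to use: one per segment
 * for all-segment tables, otherwise a single writer (old behavior). */
int
num_parallel_writers(ExecLocation loc, int num_segments)
{
    if (loc == FTEXECLOCATION_ALL_SEGMENTS)
        return num_segments;    /* analogous to writable external tables */
    return 1;                   /* master / any-segment: policy unchanged */
}
```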
Previously, gpinitsystem was incorrectly filling the hostname field of each
segment in gp_segment_configuration with the segment's address. This commit
changes it to correctly resolve hostnames and update the catalog accordingly.

This reverts commit 12ef735 ("Revert "gpinitsystem: update catalog with
correct hostname"").
   Commit message from 12ef735:
   The commit requires some additional tweaks to the input file logic for
   backwards compatibility purposes, so we're reverting this until the full
   fix is ready.
Previously, gpinitsystem did not allow the user to specify a hostname
and address for each segment in the input file used with -I; it only
accepted one value per segment and used it for both hostname and
address.

This commit changes the behavior so that the user can specify both
hostname and address. If the user specifies only the address (such as by
using an old config file), it will preserve the old behavior and set
both hostname and address to that value. It also adds a few tests around
input file parsing so SET_VAR is more resilient to further refactors.

The specific changes involved are the following:

1) Change SET_VAR to be able to parse either the old format (address
only) or new format (host and address) of the segment array
representation.
2) Move SET_VAR from gpinitsystem to gp_bash_functions.sh and remove the
redundant copy in gpcreateseg.sh.
3) Remove a hardcoded "~0" in QD_PRIMARY_ARRAY in gpinitsystem,
representing a replication port value, that was left over from 5X.
4) Improve the check for the number of fields in the segment array
representation.

Also, remove use of the ignore-warning flag and use [[ ]] for the IGNORE_WARNING check.
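The dual-format parse in step 1 can be sketched as below. The exact field layout and counts are assumptions for illustration (the real SET_VAR lives in gp_bash_functions.sh and parses a `~`-separated segment array); `parse_segment` and its field order are hypothetical.

```c
#include <assert.h>
#include <stdio.h>
#include <string.h>

/* Hedged sketch of the dual-format parse described above. Assumed layouts
 * (illustrative, not gpinitsystem's actual format):
 *   old: address~port~datadir~dbid~content            (5 fields)
 *   new: hostname~address~port~datadir~dbid~content   (6 fields)
 * Old-format input preserves the old behavior: the single value is used
 * for both hostname and address. */
typedef struct
{
    char hostname[64];
    char address[64];
} SegEntry;

int
parse_segment(const char *line, SegEntry *out)
{
    char    buf[256];
    char   *fields[8];
    int     n = 0;

    snprintf(buf, sizeof(buf), "%s", line);
    for (char *tok = strtok(buf, "~"); tok && n < 8; tok = strtok(NULL, "~"))
        fields[n++] = tok;

    if (n == 6)                 /* new format: hostname and address differ */
    {
        snprintf(out->hostname, sizeof(out->hostname), "%s", fields[0]);
        snprintf(out->address, sizeof(out->address), "%s", fields[1]);
    }
    else if (n == 5)            /* old format: reuse the address for both */
    {
        snprintf(out->hostname, sizeof(out->hostname), "%s", fields[0]);
        snprintf(out->address, sizeof(out->address), "%s", fields[0]);
    }
    else
        return -1;              /* wrong field count (check from step 4) */
    return 0;
}
```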
Commit 13a1f66 forgot to modify the corresponding LockTagTypeNames array;
this commit fixes that.
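The class of bug being fixed — extending an enum without extending its parallel name array — can be caught at compile time. A hedged, self-contained illustration (the `Demo*` enum members and strings here are invented for the sketch, not the real LockTagType values):

```c
#include <assert.h>
#include <string.h>

/* Hedged illustration of keeping an enum and its name array in sync.
 * The _Static_assert fails to compile if a new enum member is added
 * without a matching entry in the names array. */
typedef enum
{
    DEMO_LOCKTAG_RELATION,
    DEMO_LOCKTAG_PAGE,
    DEMO_LOCKTAG_NEWTYPE,           /* newly added member */
    DEMO_LOCKTAG_LAST = DEMO_LOCKTAG_NEWTYPE
} DemoLockTagType;

static const char *const DemoLockTagTypeNames[] =
{
    "relation",
    "page",
    "newtype"                       /* must be added together with the enum */
};

/* compile-time check: one name per enum member */
_Static_assert(sizeof(DemoLockTagTypeNames) / sizeof(DemoLockTagTypeNames[0])
               == DEMO_LOCKTAG_LAST + 1,
               "DemoLockTagTypeNames is out of sync with DemoLockTagType");
```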
The dbms_pipe_session_{A,B} tests are flaky in CI because session B can
call receiveFrom() before session A has even called
createImplicitPipe(). This leads to flaky test failures such as:

--- /tmp/build/e18b2f02/gpdb_src/gpcontrib/orafce/expected/dbms_pipe_session_B.out	2020-04-20 17:02:27.270832458 +0000
+++ /tmp/build/e18b2f02/gpdb_src/gpcontrib/orafce/results/dbms_pipe_session_B.out	2020-04-20 17:02:27.278832994 +0000
@@ -7,14 +7,6 @@

 -- Receives messages sent via an implicit pipe
 SELECT receiveFrom('named_pipe');
-NOTICE:  RECEIVE 11: Message From Session A
-NOTICE:  RECEIVE 12: 01-01-2013
-NOTICE:  RECEIVE 13: Tue Jan 01 09:00:00 2013 PST
-NOTICE:  RECEIVE 23: \201
-NOTICE:  RECEIVE 24: (2,rob)
-NOTICE:  RECEIVE 9: 12345
-NOTICE:  RECEIVE 9: 12345.6789
-NOTICE:  RECEIVE 9: 99999999999
  receivefrom
 -------------

@@ -152,12 +144,13 @@
 ORDER BY 1;
       name      | items | limit | private |      owner
 ----------------+-------+-------+---------+-----------------
+ named_pipe     |     9 |    10 | f       |
  pipe_name_3    |     1 |       | f       |
  private_pipe_1 |     0 |    10 | t       | pipe_test_owner
  private_pipe_2 |     9 |    10 | t       | pipe_test_owner
  public_pipe_3  |     0 |    10 | f       |
  public_pipe_4  |     0 |    10 | f       |
-(5 rows)
+(6 rows)

This commit introduces an explicit sleep at the start of session B to
give session A a better chance to run.

Co-authored-by: Jesse Zhang <jzhang@pivotal.io>
(cherry picked from commit 985c5e2)
This is a backport from master 2a7b2bf.
Target partitions need new ResultRelInfos and override the previous
estate->es_result_relation_info in NextCopyFromExecute(). The
new ResultRelInfo may leave its resultSlot as NULL. If sreh is
on, a parsing error is caught and we loop back to parse
another row; however, estate->es_result_relation_info has
already been changed. This can cause a crash.

Reproduce:

```sql
CREATE TABLE partdisttest(id INT, t TIMESTAMP, d VARCHAR(4))
DISTRIBUTED BY (id)
PARTITION BY RANGE (t)
(
  PARTITION p2020 START ('2020-01-01'::TIMESTAMP) END ('2021-01-01'::TIMESTAMP),
  DEFAULT PARTITION extra
);

COPY partdisttest FROM STDIN LOG ERRORS SEGMENT REJECT LIMIT 2;
1	'2020-04-15'	abcde
1	'2020-04-15'	abc
\.
```

Authored-by: ggbq <taos.alias@outlook.com>
(cherry picked from commit 4eebb0e)
(cherry picked from commit c975679)
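The failure mode above amounts to a dangling executor pointer across the sreh error-recovery loop. A hedged single-file sketch of the save/restore pattern that avoids it — the struct fields echo the names in the commit message, but `copy_one_row` and the trimmed-down types are invented for illustration, not GPDB's actual code:

```c
#include <assert.h>
#include <stddef.h>

/* Hedged sketch: when sreh catches a parse error mid-row, execution loops
 * back for the next row, but estate->es_result_relation_info may still
 * point at the target partition's ResultRelInfo (whose result slot can be
 * NULL). Restoring the saved pointer on every exit path avoids the crash. */
typedef struct ResultRelInfo
{
    void   *resultSlot;         /* may be NULL for a fresh partition */
} ResultRelInfo;

typedef struct EState
{
    ResultRelInfo *es_result_relation_info;
} EState;

void
copy_one_row(EState *estate, ResultRelInfo *target_partition, int parse_failed)
{
    ResultRelInfo *saved = estate->es_result_relation_info;

    estate->es_result_relation_info = target_partition;
    if (parse_failed)
    {
        /* sreh swallows the error; restore before parsing the next row */
        estate->es_result_relation_info = saved;
        return;
    }
    /* ... insert the tuple into the target partition ... */
    estate->es_result_relation_info = saved;
}
```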
With this, the xerces headers are not pulled into the xforms/ files.
Makes each .o file about 100 kB shorter. Shrinks the postgres binary
from about 128 MB to 121 MB, with assertions and debugging enabled.

(cherry picked from commit 35cfc37)
Try to not pull in unnecessary dependencies in header files.

(cherry picked from commit 632ad76)
CMemoryPool.h is included literally everywhere, because it comes with
gpos/base.h. Every little bit helps.

(cherry picked from commit 99a0066)
Let's keep base.h as slim as possible.

(cherry picked from commit b88c819)
Avoid including dxlops.h, which pulls *all* the CParseHandler header
files. Makes the postgres binary (with assertions and debugging
information) about 1.5 MB smaller.

(cherry picked from commit 529ce1a)
(cherry picked from commit 347fba3)
ops.h brings in the headers for *all* the files in include/gpopt/operators/,
which is way more than is needed in most cases.

(cherry picked from commit 143dd82)
They use GPOS_RESET_EX, which needs ITask.

Fix missing includes in unit tests.

(cherry picked from commit 88f9744)
DPE stats are computed when we have a dynamic partition selector that's
applied on another child of a join. The current code continues to use
DPE stats even for the common ancestor join and nodes above it, but
those nodes aren't affected by the partition selector.

Regular Memo groups pick the best expression among several to compute
stats, which makes row count estimates more reliable. We don't have
that luxury with DPE stats, therefore they are often less reliable.

By minimizing the places where we use DPE stats, we should overall get
more reliable row count estimates with DPE stats enabled.

The fix also ignores DPE stats with row counts greater than the group
stats. Partition selectors eliminate certain partitions, therefore
it is impossible for them to increase the row count.
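That last rule reduces to clamping the DPE-derived estimate at the group's own estimate. A minimal sketch, with a hypothetical helper name (the real logic sits in ORCA's statistics derivation):

```c
#include <assert.h>

/* Hedged sketch of the capping rule described above: a partition selector
 * can only eliminate rows, so a DPE-derived row count larger than the
 * group's own estimate is treated as unreliable and clamped. */
double
effective_dpe_rows(double dpe_rows, double group_rows)
{
    return dpe_rows > group_rows ? group_rows : dpe_rows;
}
```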
Looks like we were missing an "extern" in two places. While I was at it,
also tidy up guc_gp.c by moving the definition of Debug_resource_group
into cdbvars.c, and add declaration of
gp_encoding_check_locale_compatibility to cdbvars.h.

This was uncovered by building with GCC 10 and Clang 11, where
-fno-common is the new default [1][2] (vis-à-vis -fcommon). I could also
reproduce this by turning on "-fno-common" in older releases of GCC and
Clang.

We were relying on a myth (or legacy compiler behavior, rather) that C
tentative definitions act _just like_ declarations -- in plain English:
missing an "extern" in a global variable declaration-wannabe wouldn't
harm you, as long as you don't put an initial value after it.

This resolves #10072.

[1] "3.17 Options for Code Generation Conventions: -fcommon"
https://gcc.gnu.org/onlinedocs/gcc-10.1.0/gcc/Code-Gen-Options.html#index-tentative-definitions
[2] "Porting to GCC 10" https://gcc.gnu.org/gcc-10/porting_to.html
[3] "[Driver] Default to -fno-common for all targets" https://reviews.llvm.org/D75056

(cherry picked from commit ee7eb0e)
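The tentative-definition distinction can be shown in a single file; the link-time breakage only appears when two translation units both carry the tentative definition (-fcommon merges them, -fno-common reports a duplicate symbol). The variable and function names below are made up for illustration:

```c
#include <assert.h>

/* Hedged single-file illustration of the pitfall described above. */
int Debug_resource_group_demo;          /* tentative definition: allocates
                                         * storage in this translation unit;
                                         * duplicating it in another .c file
                                         * breaks the link under -fno-common */
extern int Debug_resource_group_demo;   /* true declaration: no storage, safe
                                         * to repeat in any number of files */

int
read_flag(void)
{
    /* tentative definitions are zero-initialized, like a GUC default */
    return Debug_resource_group_demo;
}
```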
gpload in the latest windows client package requires VS redistributable
package. Output more meaningful message if pg.py fails to load.
David Yozie and others added 22 commits June 3, 2020 11:49
Orca uses this property for cardinality estimation of joins.
For example, a join predicate foo join bar on foo.a = upper(bar.b)
will have a cardinality estimate similar to foo join bar on foo.a = bar.b.

Other functions, like foo join bar on foo.a = substring(bar.b, 1, 1)
won't be treated that way, since they are more likely to have a greater
effect on join cardinalities.

Since this is specific to ORCA, we use logic in the translator to determine
whether a function or operator is NDV-preserving. Right now we consider
a very limited set of operators; we may add more at a later time.

Let's assume that we join tables R and S and that f is a function or
expression that refers to a single column and does not preserve
NDVs. Let's also assume that p is a function or expression that also
refers to a single column and that does preserve NDVs:

join predicate       card. estimate                         comment
-------------------  -------------------------------------  -----------------------------
col1 = col2          |R| * |S| / max(NDV(col1), NDV(col2))  build an equi-join histogram
f(col1) = p(col2)    |R| * |S| / NDV(col2)                  use NDV-based estimation
f(col1) = col2       |R| * |S| / NDV(col2)                  use NDV-based estimation
p(col1) = col2       |R| * |S| / max(NDV(col1), NDV(col2))  use NDV-based estimation
p(col1) = p(col2)    |R| * |S| / max(NDV(col1), NDV(col2))  use NDV-based estimation
otherwise            |R| * |S| * 0.4                        this is an unsupported pred
Note that adding casts to these expressions is OK, as is switching the left and right sides.

Here is a list of expressions that we currently treat as NDV-preserving:

coalesce(col, const)
col || const
lower(col)
trim(col)
upper(col)

One more note: We need the NDVs of the inner side of Semi and
Anti-joins for cardinality estimation, so only normal columns and
NDV-preserving functions are allowed in that case.
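The estimation table above can be sketched as a small function. This is illustrative only — `join_estimate` and the `SideKind` classification are invented for the sketch; ORCA's real logic lives in the translator and the cardinality estimator:

```c
#include <assert.h>

/* Hedged sketch of the estimation rules above. Each side of the equality
 * predicate is classified as a bare column, an NDV-preserving expression
 * p(col), or some other function f(col). */
typedef enum
{
    BARE_COLUMN,        /* col */
    NDV_PRESERVING,     /* p(col): lower(), upper(), trim(), ... */
    OTHER_FUNCTION      /* f(col): substring(), ... */
} SideKind;

double
join_estimate(double r_rows, double s_rows,
              double ndv1, double ndv2,
              SideKind left, SideKind right)
{
    double  maxndv = ndv1 > ndv2 ? ndv1 : ndv2;

    if (left == OTHER_FUNCTION && right == OTHER_FUNCTION)
        return r_rows * s_rows * 0.4;   /* unsupported predicate */
    if (left == OTHER_FUNCTION)
        return r_rows * s_rows / ndv2;  /* f(col1) = col2 or p(col2) */
    if (right == OTHER_FUNCTION)
        return r_rows * s_rows / ndv1;  /* mirror case */
    /* col = col, p(col1) = col2, p(col1) = p(col2) */
    return r_rows * s_rows / maxndv;
}
```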

This is a port of these GPDB 5X and GPOrca PRs:
https://github.com/greenplum-db/gporca/pull/585
https://github.com/greenplum-db/gpdb/pull/10090

(cherry picked from commit 3ccd1eb)

Also updated join.sql expected files with minor motion changes.
* docs - new pxf IGNORE_MISSING_PATH option

* reword default case

* add IGNORE_MISSING_PATH info to relevant profiles

* the action to take

* try to describe why pxf behaviour is not optimal
We now use the initplan id to differentiate the tuplestores used by
different INITPLAN functions, so each INITPLAN function writes its
result into its own tuplestore.

Also fix a bug that appended the initplan in the wrong place, which
could produce wrong results in the UNION ALL case.
cherry-pick from: 2589a3
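The disambiguation idea can be sketched as deriving a distinct tuplestore name from the initplan id. The naming scheme and helper below are assumptions for illustration, not the actual identifier format used by GPDB:

```c
#include <assert.h>
#include <stddef.h>
#include <stdio.h>
#include <string.h>

/* Hedged sketch: build a per-initplan tuplestore name so two INITPLAN
 * functions in the same query no longer share one store. Hypothetical
 * naming scheme for illustration only. */
void
initplan_tuplestore_name(char *buf, size_t len, int session_id, int init_plan_id)
{
    snprintf(buf, len, "initplan_func_%d_%d", session_id, init_plan_id);
}
```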
…10238)

* docs - add info about moving a query to a different resource group

* need to be superuser

* remove upgrade/downgrade info for master
When introducing a new mirror, we need two steps:
1. start mirror segment
2. update gp_segment_configuration catalog

Previously, gp_add_segment_mirror was called to update
the catalog, but the dbid was chosen by get_availableDbId(), which
cannot guarantee the same dbid as in internal.auto.conf.
Reported in issue 9837.

Reviewed-by: Paul Guo <pguo@pivotal.io>
Reviewed-by: Bhuvnesh Chaudhary <bhuvnesh2703@gmail.com>

cherry-pick from commit: f7965d and 1ee999
* Update statement about mirroring recommendations & support

* Updates based on k8s feedback
After gprecoverseg, we need to wait until the cluster is synchronized before
running subsequent tests.

Reviewed-by: Hubert Zhang <hzhang@pivotal.io>
Reviewed-by: Ashwin Agrawal <aagrawal@pivotal.io>

Cherry-picked from d490798
…all back (#10265)

We found that when we have window functions and also correlated subqueries in the
same target list, the CQueryMutators::NormalizeWindowProjList method would leave
the varattno attributes of outer references in the subquery unchanged. That needs
to be changed, since we are producing a different RTE for the query.

We will eventually create a fix. For now, this PR just searches for the
problem and triggers a fallback when we see it, to avoid incorrect results.

Co-authored-by: Abhijit Subramanya <asubramanya@pivotal.io>
Co-authored-by: Hans Zeller <hzeller@vmware.com>
@deart2k deart2k merged commit 698437d into adb-6.x Jun 17, 2020
RekGRpth pushed a commit that referenced this pull request Dec 3, 2025
Change bug_report format to YAML
@deart2k deart2k deleted the 6.8.1-sync branch December 11, 2025 13:12