
ADBDEV-911 ADB 5.28.0 sync#86

Merged
deart2k merged 32 commits into adb-5.x from 5.28.0-sync
Jun 26, 2020

Conversation

@deart2k (Member) commented Jun 26, 2020

No description provided.

paul-guo- and others added 30 commits May 27, 2020 15:54
…heckpoint. (#10094)

These data are stored as extended checkpoint data so that prepared transactions are
not forgotten after a checkpoint. If we access them without a lock, we may read
inconsistent data, which can lead to unknown behavior.

Reviewed-by: Hao Wu <hawu@pivotal.io>
Reviewed-by: Ashwin Agrawal <aagrawal@pivotal.io>

Cherry-picked from 3c284a3
This is a continuation of commit 456b2b3 in GPORCA, adding more errors to the list that
do not get logged in the log file. We are also removing the code that writes to std::cerr,
which generated a rather unsightly log message. Instead, we add the information about
whether the error was unexpected to another log message that we already generate.

This is a cherry-pick of GPDB master PR greenplum-db/gpdb#10100.
The corresponding Orca PR is https://github.com/greenplum-db/gporca/pull/586.
…property (#10090)

Orca uses this property for cardinality estimation of joins.
For example, a join predicate foo join bar on foo.a = upper(bar.b)
will have a cardinality estimate similar to foo join bar on foo.a = bar.b.

Other functions, like foo join bar on foo.a = substring(bar.b, 1, 1)
won't be treated that way, since they are more likely to have a greater
effect on join cardinalities.

Since this is specific to ORCA, we use logic in the translator to determine
whether a function or operator is NDV-preserving. Right now we consider only
a very limited set of operators; we may add more at a later time.
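
A minimal SQL sketch of the distinction, using hypothetical tables foo and bar (not part of this commit):

```sql
-- Hypothetical tables for illustration only.
CREATE TABLE foo (a text);
CREATE TABLE bar (b text);

-- upper() preserves the number of distinct values of bar.b, so ORCA can
-- estimate this join's cardinality as if the predicate were foo.a = bar.b.
EXPLAIN SELECT * FROM foo JOIN bar ON foo.a = upper(bar.b);

-- substring() can collapse many distinct values into one, so it is not
-- treated as NDV-preserving and gets the usual, more conservative estimate.
EXPLAIN SELECT * FROM foo JOIN bar ON foo.a = substring(bar.b, 1, 1);
```
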
… in ORCA

Duplicate sensitive HashDistribute Motions generated by ORCA get
translated to Result nodes with hashFilter cols set. However, if the
Motion needs to distribute based on a complex expression (rather than
just a Var), the expression must be added into the targetlist of the
Result node and then referenced in hashFilterColIdx.

However, this can affect other operators above the Result node. For
example, a Hash operator expects the targetlist of its child node to
contain only elements that are to be hashed. Additional expressions here
can cause issues with memtuple bindings that can lead to errors.

(E.g., the attached test case, when run without this fix, will give the
error "invalid input syntax for integer:".)

This PR fixes the issue by adding an additional Result node on top of
the duplicate sensitive Result node to project only the elements from
the original targetlist in such cases.
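
A hedged, hypothetical sketch of a query shape where the Motion's hash key is a computed expression rather than a plain Var (the actual reproducing query is in the attached test case, not shown here; whether a duplicate-sensitive Motion appears depends on the ORCA plan):

```sql
-- Hypothetical illustration, not the test case from the commit.
-- Inserting into a table distributed by a computed value forces a
-- redistribute Motion keyed on an expression, not a plain column.
CREATE TABLE src (a int, b text) DISTRIBUTED BY (a);
CREATE TABLE dst (a int, blen int) DISTRIBUTED BY (blen);

INSERT INTO dst SELECT a, length(b) + 1 FROM src;
```
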
This is the beginning of a collection of SQL-callable functions to
verify the integrity of data files.  For now it only contains code to
verify B-Tree indexes.

This adds two SQL-callable functions, validating B-Tree consistency to
a varying degree. Check the extensive docs for details.

The goal is to later extend the coverage of the module to further
access methods, possibly including the heap.  Once checks for
additional access methods exist, we'll likely add some "dispatch"
functions that cover multiple access methods.

Author: Peter Geoghegan, editorialized by Andres Freund
Reviewed-By: Andres Freund, Tomas Vondra, Thomas Munro,
   Anastasia Lubennikova, Robert Haas, Amit Langote
Discussion: CAM3SWZQzLMhMwmBqjzK+pRKXrNUZ4w90wYMUWfkeV8mZ3Debvw@mail.gmail.com
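
For reference, a minimal usage sketch of the two functions this module ships, run here against a catalog index:

```sql
CREATE EXTENSION amcheck;

-- Lighter check: verifies invariants within the target index itself,
-- taking only an AccessShareLock (like a plain SELECT).
SELECT bt_index_check('pg_class_oid_index');

-- Stronger check: additionally verifies parent/child page relationships,
-- at the cost of a heavier ShareLock.
SELECT bt_index_parent_check('pg_class_oid_index');
```
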
No exclusive lock is taken anymore...
The previous coding of the test was vulnerable to autovacuum triggering
work on one of the tables in check_btree.sql.

For the purpose of the test it's entirely sufficient to check for
locks taken by the current process, so add an appropriate restriction.
While touching the test, expand it to also check for locks on the
underlying relations, rather than just the indexes.

Reported-By: Tom Lane
Discussion: https://postgr.es/m/30354.1489434301@sss.pgh.pa.us
Cherry-pick 382ceff from upstream, only the changes to amcheck.
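
The restriction amounts to filtering pg_locks down to the current backend, along these lines (a sketch, not the exact test SQL):

```sql
-- Only locks taken by the current session are relevant to the test, so a
-- concurrent autovacuum worker's locks can no longer perturb the output.
SELECT locktype, relation::regclass, mode
FROM pg_locks
WHERE pid = pg_backend_pid();
```
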
…_xxx".

contrib/amcheck didn't get the memo either.
* Update statement about mirroring recommendations & support

* Updates based on k8s feedback
The gpcheckcat persistent check consists of a series of extra/missing
checks; one of them detects extra relation files left in the file system
after a table is dropped. The main idea is:
1. use gp_persistent_relation_node_check() to list all relation
   files in the filesystem;
2. check whether the pg_class catalog contains entries referring to
   those relfilenodes;
3. check whether the gp_persistent_relation_node catalog contains
   entries referring to those relfilenodes.

Additional background:
heap tables only ever store segment_file_num 0 in the persistent
tables, while AO/CO tables store every segment_file_num that they
use. We need to handle this difference in the SQL file; otherwise,
files with segment_file_num > 0 are treated as extra files.

With the old filter in the SQL file, extra segment files > 0 could not
be detected, so enhance the filter.
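
A simplified sketch of the extra-files side of this check (not gpcheckcat's exact SQL; column names assumed to follow gp_persistent_relation_node):

```sql
-- Files reported by the filesystem scan that have no owning pg_class entry.
-- Heap tables only ever record segment_file_num 0, while AO/CO tables record
-- every segment file they use, so the real filter must not flag
-- segment_file_num > 0 of a known relfilenode as extra.
SELECT f.relfilenode_oid, f.segment_file_num
FROM gp_persistent_relation_node_check() f
LEFT JOIN pg_class c ON c.relfilenode = f.relfilenode_oid
WHERE c.oid IS NULL;
```
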
Tests that run ahead of persistent_filesystem already produced persistent
errors, so it was impossible to tell whether a persistent error was
triggered by the persistent_filesystem test or not. Move persistent_filesystem
to a location where no persistent error has been generated yet.
* docs - add views pg_stat_all_tables and indexes - 5X

The views currently display access statistics only from master.
Add 6.x specific DDL for views that display access statistics from master and segments.

Also add some statistics GUCs.
--track_activities
--track_counts

* docs - clarify seq_scan and idx_scan refer to the total number of scans from all segments

* docs - minor edits
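
A quick way to see these views and GUCs in action (standard catalog views; the exact 6.x-specific DDL is in the docs change itself):

```sql
-- Total scan counts per user table; per the clarification above, seq_scan
-- and idx_scan refer to the total number of scans from all segments.
SELECT schemaname, relname, seq_scan, idx_scan
FROM pg_stat_all_tables
WHERE schemaname NOT IN ('pg_catalog', 'information_schema');

-- The statistics GUCs mentioned above control whether these counters
-- are collected at all.
SHOW track_activities;
SHOW track_counts;
```
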
When an error happens after ProcArrayEndTransaction, execution recurses back into
AbortTransaction; we need to make sure this does not generate an extra WAL record
and does not fail the assertions.
Following the [Greenplum Server RPM Packaging Specification][0], we need
to update the greenplum_path.sh file and ensure that many environment
variables are set correctly.

There are a few basic requirements for the Greenplum Path Layer:

* greenplum_path.sh shall be installed to `${installation
  prefix}/greenplum-db-[package-version]/greenplum_path.sh`
* `${GPHOME}` is set by a given parameter; by default it should point to
  `%{installation prefix}/greenplum-db-devel`
* `${LD_LIBRARY_PATH}` shall be safely set to avoid a trailing colon
  (which will cause the linker to search the current directory when
  resolving shared objects)
* `${PYTHONHOME}` shall be set to `${GPHOME}/ext/python`
* `${PYTHONPATH}` shall be set to `${GPHOME}/lib/python`
* `${PATH}` shall be set to `${GPHOME}/bin:${PYTHONHOME}/bin:${PATH}`
* If the file `${GPHOME}/etc/openssl.cnf` exists then `${OPENSSL_CONF}`
  shall be set to `${GPHOME}/etc/openssl.cnf`
* The greenplum_path.sh file shall pass [ShellCheck][1]

[0]: https://github.com/greenplum-db/greenplum-database-release/blob/master/Greenplum-Server-RPM-Packaging-Specification.md#detailed-package-behavior
[1]: https://github.com/koalaman/shellcheck

[#171588834]

Co-authored-by: Tingfang Bao <bbao@pivotal.io>
Co-authored-by: Bradford D. Boyle <bradfordb@vmware.com>
Co-authored-by: Xin Zhang <zhxin@vmware.com>
When upgrading from GPDB5 to GPDB6, gpupgrade will need to be able to call
binaries from both major versions. Relying on LD_LIBRARY_PATH is not an option
because this can cause binaries to load libraries from the wrong version.
Instead, we need the libraries to have RPATH/RUNPATH set correctly. Since the
built binaries may be relocated we need to use a relative path.

This commit disables the rpath configure option (which would result in an
absolute path) and exports LD_RUN_PATH to use `$ORIGIN`.

For most ELF files a RUNPATH of `$ORIGIN/../lib` is correct. For the pygresql
python module, the RUNPATH needs to be adjusted accordingly.

Authored-by: Shaoqi Bai <sbai@pivotal.io>
The format of the `findstring` function is `$(findstring find,in)`, where
it searches `in` for occurrences of `find`. The value of BLD_ARCH for
RHEL7 is rhel7_x86_64 and will never be found in `rhel6 rhel7`. This
commit changes the conditional to search for `rhel` in the value of
`BLD_ARCH` and if it is found, set the custom LDFLAGS.

[#171588911]

Co-authored-by: Bradford D. Boyle <bboyle@pivotal.io>
Co-authored-by: Xin Zhang <xzhang@pivotal.io>
Authored-by: Tingfang Bao <baotingfang@gmail.com>
Authored-by: Shaoqi Bai <sbai@pivotal.io>
Currently, GPDB5 is built with --enable-rpath (the default configure
option). For plperl, its Makefile specifies an absolute path to the
location of "$(perl -MConfig -e 'print $Config{archlibexp}')/CORE"
(e.g., /usr/lib64/perl5/CORE on RHEL7). This directory is not on the
default search path for the runtime linker. Without the proper RUNPATH
entry, libperl.so cannot be found when postgres tries to load the plperl
extension.

Without the correct RUNPATH set for plperl.so, you will see an error like
the following:
ERROR:  could not load library
"/usr/local/greenplum-db-devel/lib/postgresql/plperl.so": libperl.so:
cannot open shared object file: No such file or directory

Authored-by: Shaoqi Bai <bshaoqi@vmware.com>
We do not vendor python on OSS Ubuntu 16.04 and therefore
`$GPHOME/ext/python` doesn't exist. This causes the system python to
fail with an error message about not being able to find the site module.

This commit updates the logic in generate-greenplum_path.sh to check
whether we are vendoring python and, if we are, to include setting
`$PYTHONHOME` in greenplum_path.sh. In that case, at build time
`$PYTHONHOME` is set to point at our vendored python.

[#173046174]

Authored-by: Bradford D. Boyle <bradfordb@vmware.com>
….4 to 3.5

- Treat REVISION as an ordinary string. The REVISION for the newly added
  1.6.2+vmware.1 has a dot that needs to be treated as an ordinary string,
  or else it cannot match.

Authored-by: Shaoqi Bai <bshaoqi@vmware.com>
@deart2k deart2k merged commit 4c9e704 into adb-5.x Jun 26, 2020
RekGRpth pushed a commit that referenced this pull request Dec 3, 2025
Function XactLockTableWait() calls LocalXidGetDistributedXid(), which
may get the gxid corresponding to the local wait xid from the distributed
clog in case the dtx we are waiting for managed to commit by the time we
access its gxid. For such a case there is an assertion introduced by
commit 13a1f66. The assert indicates that the committed transaction was
just running in parallel with the current one, meaning there is no other
reason to access the distributed transaction history. If the transaction
had been committed a long time ago, XactLockTableWait() would never be called.

However, there is a case when we can't compare the timestamps: a
vacuum operation, which performs an in-place update of pg_database
(or pg_class) without being in a distributed transaction. For this
case, this patch extends the assertion by allowing the current timestamp
to have a zero value.

The new test related to this case is added in 2c2753a.
@deart2k deart2k deleted the 5.28.0-sync branch December 11, 2025 13:13