Conversation
* docs - ditamap updates for 5.29 release * update oss book subnavs
Co-authored-by: David Yozie <dyozie@pivotal.io>
Before current commit we initailized pg_aocsseg entries with frozen tuples to make these entries implicitly visible even after rollback. It is a working approach for AO tables, but AOC tables store additional "vpinfo" structure with serialized information about every column EOF. This can cause inconsistency between "vpinfo" structure in pg_aocsseg entries and the actual number of columns in a table when we modify the amount of columns under explicit transaction and rollback it later. As a result we fail on asserts in "GetAllAOCSFileSegInfo_pg_aocsseg_rel()" or (in a case of a build without asserts) in some unpredictable places in the code when retrieve metadata from pg_aocsseg. As a fix this commit inserts simple tuples to pg_aocsseg but still insert frozen tuples to gp_fastsequence (as it generates row numbers for segment files and should never be rollbacked to avoid inconsistency in index pointers of the AO/AOC tables). Co-authored-by: Vasiliy Ivanov <ivi@arenadata.io> Co-authored-by: Denis Smirnov <sd@arenadata.io>
This relaxes the need to have Debug_filerep_print on for cases where the resync manager skips a relfile for resync. There are some improvements made to these log messages along the way. Additionally, we now log by default whenever a resync worker updates a resync hash table entry as completed. This is done in response to bug reports where a full recovery, following an interrupted resync from an incremental recovery, has inadvertently skipped over scheduling relfiles. The relaxed logging can give us clues as to why certain relfiles were skipped - as all files should be scheduled for resync, when running a full recovery. Co-authored-by: Ashwin Agrawal <aashwin@vmware.com>
This was introduced in aa6c213. fsObjName is not initialized before the logging statement. Use relFileNode instead to log the persistent object being skipped. This was causing PANICs such as: Unexpected persistent file-system object type: 17419088
We were using the passed in entry in the log statement. Use the updated entry instead, as the passed in entry might not have all of its fields populated - as is the case for the callsite in FileRepPrimary_ResyncBufferPoolIncrementalWrite(). Also, in case the lookup returned NULL, we used to log as if nothing happened. So, we add an appropriate WARNING here. Co-authored-by: Jimmy Yih <jyih@pivotal.io>
Remove behave step "the database is killed on hosts mdw,sdw1,sdw2" in favor of "the database is not running. This shuts down the databse more cleanly, and avoids any potential race conditons. (cherry picked from commit cb5d0ba) This was causing the gpaddmirrors behave tests to flake with the following error during gpinitsystem: gpinitsystem:mdw:gpadmin-[FATAL]:-Host sdw1 has an active database process on port = 20500 Script Exiting!: Conflicts: * Decided to keep: @given('the database is killed on hosts "{hostnames}"') since it's still used by gpexpand tests. Removing that can be tackled in a separate commit. * Removed the rm -rf /tmp/gpaddmirrors/* statements - we need the datadirs to gpstop: Incidentally, this was also removed n 6X commit: 76a8111 * check_database_is_running() was getting the wrong value from master_data_directory. Using os.environ['MASTER_DATA_DIRECTORY'] instead. Co-authored-by: Jimmy Yih <jyih@vmware.com>
When primary and unique key constraints are added to a partition hierarchy with an ALTER TABLE ADD statement on the root table, the auto-generated name of the constraint differs on the partition root vs the partition children. As a result, when the same constraint is dropped from the parent, the constraint is NOT dropped from the partition children. To make matters worse, any attempt at dropping the constraint from the children directly, results in: ERROR: can't drop a constraint from <child_table_name>; it is part of a partitioned table HINT: You may be able to perform the operation on the partitioned table as a whole. The above was first raised in issue: #3750 and was fixed in 6X+ by #7047 which employs large changes to constraint and partition table infrastructure, making it not viable to backport. Furthermore, renaming the constraints is not viable as a workaround, as renaming constraints is not supported in 5X. Thus, this commit adds gp_enable_drop_key_constraint_child_partition, a GUC that provides the user a way to drop the primary/unique key constraint directly from the partition children. Note: the GUC does not apply to: i) CHECK constraints - they don't have the naming inconsistency ii) FOREIGN KEY constraints - they are not auto-created on partition children. iii) NOT NULL constraints = they are not auto-created on partition children. Furthermore, the GUC should not be necessary for primary/unique key constraints added at table creation time - as any constraints added at table creation time don't suffer from the naming inconsistency described above. Thus dropping such constraints from the root should automatically drop the constraints from the children. Co-authored-by: Jimmy Yih <jyih@vmware.com>
Our getnameinfo() replacement implementation in getaddrinfo.c failed unless NI_NUMERICHOST and NI_NUMERICSERV were given as flags, because it doesn't resolve host names, only numeric IPs. But per standard, when those flags are not given, an implementation can still degrade to not returning host names, so this restriction is unnecessary. When we remove it, we can eliminate some code in postmaster.c that apparently tried to work around that. Backported from upstream Postgres: postgres/postgres@c424d0d There were some minor merge conflicts because of a previous backport: https://github.com/greenplum-db/gpdb/commit/a9768e32a0a310bd7e7a45c089898542670bf2c2
…too. Per testing of previous patch. Backported from upstream Postgres: postgres/postgres@1997f34
There was a regression caused by a previous incomplete backport that made it so that invalid WARNING logs were being logged when log_hostname GUC was enabled and a user connects via AF_UNIX socket. As part of completing the backport to fix the issue, a change was made to getnameinfo_unix() to remove an unnecessary restriction. Add simple CMockery unit test to prevent future regressions. Example of WARNING log message: pg_getnameinfo_all() failed: Non-recoverable failure in name resolution Incomplete backport GPDB commit: https://github.com/greenplum-db/gpdb/commit/a9768e32a0a310bd7e7a45c089898542670bf2c2 Related missing upstream Postgres patch that fixes the issue: postgres/postgres@1997f34
The ip.t CMockery unit test was failing on SuSE because the sockaddr_un struct is different on SuSE compared to other Linux operating systems. Fix the issue by just using the sockaddr_un struct directly to set the sun_path. GPDB commit reference: https://github.com/greenplum-db/gpdb/commit/fc13ac07b59a33a80ff35d6a1d1dccf17ff69cea
Stolb27
approved these changes
Nov 22, 2021
Collaborator
There was a problem hiding this comment.
@maksm90, please take a look in the morning. I've requested disabling building on power for adb-5.x. Failed isolation test is a known issue that is going to be resolved in https://arenadata.atlassian.net/browse/ADBDEV-2154.
maksm90
approved these changes
Nov 22, 2021
Collaborator
|
@deart2k, it's ready to merge |
RekGRpth
added a commit
that referenced
this pull request
Mar 2, 2026
There was a 11 years old bug in the python function fetching postmaster
PID for remote host. The bash command had unescaped $ symbol. As a
result part of awk command was interpolated as bash argument before ssh.
So awk do nothing (`awk {print }`), full ps output grepped by pid and
any part of string could match, e.g. postmaster port. Moreover, even PID
could be matched as a substring for longest PIDs. The last problem
affected function for local search too. So sometimes gprecoverseg could
kill arbitrary postgres processes on the same segment host.
We simplify bash command with proper escaping and cover these functions
by unit tests
Ticket: GG-227
Co-authored-by: Georgy Shelkovy <g.shelkovy@arenadata.io>
This file contains hidden or bidirectional Unicode text that may be interpreted or compiled differently than what appears below. To review, open the file in an editor that reveals hidden Unicode characters.
Learn more about bidirectional Unicode characters
Sign up for free
to join this conversation on GitHub.
Already have an account?
Sign in to comment
Add this suggestion to a batch that can be applied as a single commit.This suggestion is invalid because no changes were made to the code.Suggestions cannot be applied while the pull request is closed.Suggestions cannot be applied while viewing a subset of changes.Only one suggestion per line can be applied in a batch.Add this suggestion to a batch that can be applied as a single commit.Applying suggestions on deleted lines is not supported.You must change the existing code in this line in order to create a valid suggestion.Outdated suggestions cannot be applied.This suggestion has been applied or marked resolved.Suggestions cannot be applied from pending reviews.Suggestions cannot be applied on multi-line comments.Suggestions cannot be applied while the pull request is queued to merge.Suggestion cannot be applied right now. Please check back later.
Here are some reminders before you submit the pull request
make installcheck