In 5X, a snapshot satisfying "now" was used when a transaction read the AO/AOCO segment file EOF. In other words, "now" allowed us to see all committed transactions. When this code was ported to 6X, a catalog snapshot was chosen for this purpose. As a result, we don't see transactions that committed after the snapshot was taken, so another transaction can change the segment file EOF between the moment the snapshot is taken and the moment the exclusive segment file lock is acquired. AO/AOCO in 6X therefore suffers from data corruption under concurrent DML queries. This is the first commit demonstrating the problem (all added tests should fail).
Previous tests showed how to get the AO/AOCO data corruption in a mixed mode, where cluster-wide queries were mixed with utility mode. The new tests prove that we get the same corruption with concurrent cluster-wide queries as well.
In 5X, a snapshot satisfying "now" was used when a transaction read the AO/AOCO segment file EOF. In other words, "now" allowed us to see all committed transactions. When this code was ported to 6X, a catalog snapshot was chosen for this purpose. As a result, we don't see transactions that committed after the snapshot was taken, so another transaction can change the segment file EOF between the moment the snapshot is taken and the moment the exclusive segment file lock is acquired. AO/AOCO in 6X therefore suffers from data corruption under concurrent DML queries. This commit fixes the problem by using SnapshotSelf instead of the catalog snapshot: it allows us to see all committed data, as "now" did in 5X.
Added a new test to demonstrate that we don't get anomalies after preparing a two-phase transaction on a QE when SnapshotSelf is used.
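The race described above can be illustrated with a toy model. This is only a sketch under simplifying assumptions, not server code: `EofVersion`, `visible_eof` and the integer commit sequence are hypothetical names invented for the illustration. A reader with a fixed snapshot misses an EOF committed after the snapshot was taken, while a reader that sees all committed data picks up the current value.

```python
from dataclasses import dataclass
from typing import List, Optional


@dataclass
class EofVersion:
    eof: int          # segment file EOF recorded by a committed transaction
    commit_seq: int   # order in which that transaction committed


def visible_eof(versions: List[EofVersion], snapshot_seq: Optional[int] = None) -> int:
    """Return the newest EOF visible to the reader.

    snapshot_seq=None models "see everything committed so far";
    an integer models a snapshot taken at that commit sequence number.
    """
    candidates = [
        v for v in versions
        if snapshot_seq is None or v.commit_seq <= snapshot_seq
    ]
    return max(candidates, key=lambda v: v.commit_seq).eof


# Initial state: one committed EOF.
versions = [EofVersion(eof=100, commit_seq=1)]

# The reader takes its snapshot here ...
snapshot_seq = 1

# ... and a concurrent writer commits a new EOF before the reader gets the lock.
versions.append(EofVersion(eof=250, commit_seq=2))

stale = visible_eof(versions, snapshot_seq)  # fixed snapshot: old EOF
fresh = visible_eof(versions)                # all committed data: current EOF
print(stale, fresh)
```

In this model, a writer that trusts the stale EOF would start writing at an offset that another transaction has already filled, which corresponds to the corruption the commits describe.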
Wrong snapshot for EOF corrupts data in AO/AOCO
…rupt its execution (#198)
Fixes the case when the server is executing a lengthy query and the client breaks the connection. The operating system will be aware that the connection is gone, but the postgres node doesn't notice this, because it doesn't try to read from or write to the socket while running the query. So we'll get a zombie connection. In theory, the query could be one that runs for a million years, continues to chew up CPU and I/O, and occupies a connection slot. Worse still, the sent query might be a data-modifying one that returns no data; the disconnected client may then be surprised that the modification is applied at some later point, when execution completes. For these reasons, the query has to be interrupted as early as possible.
The patch provides a new GUC, check_client_connection_interval, that can be used to periodically check, via CLIENT_CHECK_CONNECTION_TIMEOUT interrupts, whether the client connection has gone away while running very long queries. It is disabled by default.
For a non-locking check of the socket state, the patch uses a non-standard Linux extension (also adopted by at least one other OS): the POLLRDHUP option, which is not defined by POSIX.
Backport from PostgreSQL commits:
- https://git.postgresql.org/gitweb/?p=postgresql.git;a=commit;h=c30f54ad732ca5c8762bb68bbe0f51de9137dd72
- https://git.postgresql.org/gitweb/?p=postgresql.git;a=commit;h=22f6f2c1ccb56e0d6a159d4562418587e4b10e01
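The idea of polling a socket for POLLRDHUP without reading from it can be sketched in Python. This is a minimal illustration, not the patch itself (which is C code inside the server); `peer_has_hung_up` is a hypothetical helper name, and the sketch returns `None` on platforms where `select.POLLRDHUP` is not available.

```python
import select
import socket

# POLLRDHUP is a Linux extension; fall back to 0 where Python does not expose it.
POLLRDHUP = getattr(select, "POLLRDHUP", 0)


def peer_has_hung_up(sock):
    """Check, without blocking or consuming data, whether the peer closed.

    Returns True/False where POLLRDHUP is supported, None otherwise.
    """
    if not POLLRDHUP or not hasattr(select, "poll"):
        return None
    poller = select.poll()
    poller.register(sock.fileno(), POLLRDHUP)
    events = poller.poll(0)  # timeout of 0 ms: return immediately
    hangup_mask = POLLRDHUP | select.POLLHUP | select.POLLERR
    return any(event & hangup_mask for _fd, event in events)


# A connected pair stands in for the server/client connection.
server_side, client_side = socket.socketpair()

before = peer_has_hung_up(server_side)  # client still connected
client_side.close()                     # the client goes away
after = peer_has_hung_up(server_side)   # hang-up is now reported

server_side.close()
print(before, after)
```

A long-running worker could call such a check on a timer and cancel its work once the check reports a hang-up, which is the role the interval setting plays in the description above.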
Execution of multilevel correlated queries with a high level of nesting can cause a segfault (when using array_agg, json_agg) or can produce wrong results (when using classic aggregates like sum()). Due to some GP limitations, correlated subqueries with skip-level correlations are not supported. An additional check condition is provided to prevent such queries from being planned. The QueryHasDistributedRelation function, used by this check, doesn't recurse over subplans and may return wrong results for distributed RTE_RELATION entries hidden by RTE_SUBQUERY entries. This commit fixes that behavior by adding optional recursion to the QueryHasDistributedRelation function. An additional regression test is included. Additional information can be found in issue #12054.
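The effect of the optional recursion can be shown on a toy range table. This is a sketch only: the `Query` and `RangeTableEntry` classes and the `has_distributed_relation` function are hypothetical stand-ins and do not mirror the real planner structures or the signature of QueryHasDistributedRelation.

```python
from dataclasses import dataclass, field
from typing import List, Optional


@dataclass
class RangeTableEntry:
    kind: str                          # "relation" or "subquery"
    distributed: bool = False          # meaningful for relations only
    subquery: Optional["Query"] = None


@dataclass
class Query:
    rtable: List[RangeTableEntry] = field(default_factory=list)


def has_distributed_relation(query: Query, recurse: bool = False) -> bool:
    """Return True if the query references a distributed relation.

    Without recursion only the top-level range table is inspected, so a
    relation wrapped in a subquery entry is missed.
    """
    for rte in query.rtable:
        if rte.kind == "relation" and rte.distributed:
            return True
        if recurse and rte.kind == "subquery" and rte.subquery is not None:
            if has_distributed_relation(rte.subquery, recurse=True):
                return True
    return False


# A distributed table hidden behind one level of subquery.
inner = Query(rtable=[RangeTableEntry(kind="relation", distributed=True)])
outer = Query(rtable=[RangeTableEntry(kind="subquery", subquery=inner)])

shallow = has_distributed_relation(outer)               # misses the table
deep = has_distributed_relation(outer, recurse=True)    # finds it
print(shallow, deep)
```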
InnerLife0 approved these changes Jul 9, 2021
hilltracer pushed a commit that referenced this pull request Mar 6, 2026
- test_it_retries_the_connection: use a mock object that supports context management
- GpArrayTestCase: use the bool type instead of str 't'/'f'
- GpCheckCatTestCase: check the connection in DbWrapper.
- DifferentialRecoveryClsTestCase and GpStopSmartModeTestCase: mock GgdbCursor to return a connection.
- RepairMissingExtraneousTestCase and UniqueIndexViolationCheckTestCase: use python arrays instead of the string representation of Postgres arrays.
Also fix the seg ids set in get_segment_to_oid_mapping. Since seg ids in issues are now ints, we do not need to cast all_seg_ids array elements to strings.
Stolb27 added a commit that referenced this pull request Mar 10, 2026
The following points have been fixed:
1. PyGreSQL 5 has added support for converting additional data types.
   - Analyzedb: convert datetime to a string for correct comparison with the value saved in the file.
   - el8_migrate_localte.py, gparray.py, gpcatalog.py and gpcheckcat: use the bool type instead of comparing with a string.
   - gpcheckcat, repair_missing_extraneous.py and unique_index_violation_check: use a python list instead of string parsing.
2. PyGreSQL 5 added support for closing a connection when using the with construct. Because of this, in a number of places, reading from the cursor took place after the connection was closed.
3. PyGreSQL 5 does not end the transaction if an error occurs, which leads to a possible connection leak if an error occurs in the connect function. So catch errors that happen in the connect function.
4. Add closure of the connection saved in context after the scenario in behave tests.
5. Add closure of the connection if it is not returned from the function.
6. Use the python wrapper for the connect function instead of the C one.
7. Use a custom cursor to disable row postprocessing to avoid correcting a large amount of code.
8. Fix the bool and array format in isolation2 tests.
9. Add notifications processing to isolation2 tests.
10. Also fix the notifications processing in the resgroup_query_mem test.
11. Fix the notifications processing in gpload.
12. Fix pg_config search when building deb packages.
13. Fix gpexpand behave tests (#176)
    The previous commit added a few regressions. They were caused by replacing the comparison with 't' with a truth check. This change was made because PyGreSQL 5, unlike the 4th version, converts bool values. But it was not taken into account that such values can be set in Python code. The error of calling verify in TestDML has also been fixed. The verify method was called without passing a connection, and although the verify implementation in the class itself does not require a connection, this function may be overloaded in a child class.
14. Fix PyGreSQL install to be compatible with both python versions (#183)
    PyGreSQL install works in Python 2 but breaks in Python 3 because the _pg extension must be importable as a top-level module (e.g. from _pg import *). Python 3 resolves extension modules via sys.path, so _pg*.so has to be located at the sys.path root, not only inside the pygresql/ package directory. Move _pg*.so from the pygresql directory to the top level, so the same install layout works for both Python versions. Update the _pg*.so RPATH to match its installed location so dpkg-shlibdeps can resolve libpq.so during Debian packaging.
15. Fix Python unit tests after PyGreSQL update (#222)
    - test_it_retries_the_connection: use a mock object that supports context management
    - GpArrayTestCase: use the bool type instead of str 't'/'f'
    - GpCheckCatTestCase: check the connection in DbWrapper.
    - DifferentialRecoveryClsTestCase and GpStopSmartModeTestCase: mock GgdbCursor to return a connection.
    - RepairMissingExtraneousTestCase and UniqueIndexViolationCheckTestCase: use python arrays instead of the string representation of Postgres arrays.
    Also fix the seg ids set in get_segment_to_oid_mapping. Since seg ids in issues are now ints, we do not need to cast all_seg_ids array elements to strings.
16. Move PyGreSQL code to submodule (#269)
    It would be nice to avoid patching this module. Also, this patch fixes the Greengage installation scripts for PyGreSQL to support non-root release builds over DESTDIR. It was a problem in Greengage, not PyGreSQL. Additionally, include the new PyGreSQL license in the NOTICE file.
17. Fix minirepro and gpsd utility for PyGreSQL-5.2.5 (#291)
    Both utils used an outdated form of the pgdb.connect() call. The patch changes the way pgdb.connect() is used by avoiding the parameter that later gets parsed. Instead, both utils now use parameters of the same names.

Co-authored-by: Denis Garsh <d.garsh@arenadata.io>
Co-authored-by: Vasiliy Ivanov <ivi@arenadata.io>
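The regression described in point 13 comes from the fact that a plain truth check treats the string 'f' as true. A minimal sketch of one way to handle both representations follows; `as_bool` is a hypothetical helper written for this illustration and is not part of the patch or of PyGreSQL.

```python
def as_bool(value):
    """Normalize a boolean that may arrive as a Python bool or as 't'/'f'.

    Real bools (as converted by the newer driver) pass through unchanged;
    legacy string markers set in Python code are mapped explicitly.
    """
    if isinstance(value, bool):
        return value
    if isinstance(value, str):
        lowered = value.strip().lower()
        if lowered in ("t", "true"):
            return True
        if lowered in ("f", "false"):
            return False
    raise ValueError("not a recognizable boolean: %r" % (value,))


# A naive truth check gets the legacy 'f' marker wrong:
naive = bool("f")

# The normalizing helper handles both forms:
from_driver = as_bool(False)
from_python_code = as_bool("f")
print(naive, from_driver, from_python_code)
```

Raising on unknown input, instead of guessing, keeps a mistyped marker from silently turning into True.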
PR to 6X master to test fixes together.
Fixed: