Avoid using pg_locks with distributed tables (#291)
Merged

InnerLife0 reviewed Jan 10, 2022
In upstream Postgres, pg_locks exposes part of the lock manager so that
DBAs can inspect the locks taken by various backends. In Greenplum, we
modified pg_lock_status() -- the function that underlies pg_locks -- to
a) provide additional Greenplum-specific information (e.g.
mppsessionid); and
b) aggregate the locks from the master and all primary segments.
One consequence of the implementation we chose to achieve point b above
is that queries involving both pg_locks and a distributed table won't
work. If you're lucky (the planner or ORCA places the function call in
the top slice), it fails loudly, throwing an error like this at you:
ERROR: query plan with multiple segworker groups is not supported
If you're not lucky (the planner or ORCA schedules the function call on
a different slice running on the master), it most likely silently skips
point b and returns locks from the master only.
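For example, a join like the following hits one of the two failure modes above (a minimal sketch; my_dist_table is a hypothetical distributed table, not from the test suite):

```sql
-- Hypothetical repro: mixing pg_locks with a distributed table in one
-- query. Depending on which slice the plan places pg_lock_status() in,
-- this either raises the segworker-groups error or silently returns
-- locks from the master only.
SELECT l.locktype, l.mode, t.id
FROM pg_locks l
JOIN my_dist_table t ON l.relation = 'my_dist_table'::regclass;
```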
Before we fix pg_locks, rewrite the isolation2 test case "starve_case"
to separate the repeated queries against pg_locks from the main query.
The PL/pgSQL does logically the same thing, but it is now safe.
This commit also adds starve_case back to the isolation2 schedule.
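The safe pattern can be sketched like this (a minimal sketch, not the actual test code; the function name wait_for_lock and the polling interval are assumptions):

```sql
-- Sketch of the safe pattern: poll pg_locks in its own statement, so
-- no single query ever mixes pg_locks with a distributed table.
CREATE OR REPLACE FUNCTION wait_for_lock(rel regclass) RETURNS void AS $$
BEGIN
    WHILE NOT EXISTS (
        SELECT 1 FROM pg_locks WHERE relation = rel AND granted
    ) LOOP
        PERFORM pg_sleep(0.1);  -- pg_locks is the only relation in this query
    END LOOP;
END
$$ LANGUAGE plpgsql VOLATILE;

-- The distributed table is then queried in a separate statement, e.g.:
-- SELECT count(*) FROM my_dist_table;
```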
(cherry picked from commit 2ff0127)
For a PL/pgSQL function like the following:
set optimizer_trace_fallback to on;

CREATE OR REPLACE FUNCTION boom()
RETURNS bool AS $$
DECLARE
    mel bool;
    sesh int[];
BEGIN
    sesh := '{42,1}'::int[];  -- query 1
    SELECT c = ANY (sesh) INTO mel FROM (VALUES (42), (0)) nums(c);  -- query 2
    RETURN mel;
END
$$ LANGUAGE plpgsql VOLATILE;

SELECT boom();
With ORCA enabled, the database crashes. Starting in 9.2, PL/pgSQL
supplies bound parameter values in more statement types, enabling the
planner to fold constants in more cases (in contrast to leaving the
param intact and substituting its value only at execution time).
Previously, only dynamic execution ("EXECUTE 'SELECT $1' USING sesh")
got this treatment. The change revealed the bug: ORCA could not plan
queries whose query trees included params that were not in subplans
(external params), so it would simply fall back.
When query 1 is planned, it is translated into select '{42,1}'::int[];
For uninteresting reasons, the planner-produced plan for query 1 is
considered "simple", while the ORCA-produced plan is considered regular
(not simple). PL/pgSQL has a fast path for "simple" plans, minimally
starting the executor via `ExecEvalExpr`; regular plans are executed
through SPI. During execution, SPI will pack (as part of
`heap_form_tuple`) the 4-byte-header datum into a 1-byte-header datum.
While planning query 2, we attempt to substitute the param "sesh" with
its actual const value during pre-processing. Since ORCA doesn't
recognize const arrays as arrays, the translator takes the additional
step of translating the const into an array expression. When accessing
the array-typed const, we need to "unpack" (`DatumGetArrayTypeP`) the
datum. This commit does that.
Co-authored-by: Melanie Plageman <mplageman@pivotal.io>
(cherry picked from commit c417d2b)
e4e7960 to 89a40df
InnerLife0 approved these changes Jan 11, 2022
hilltracer pushed a commit that referenced this pull request Mar 6, 2026
Both utils used an outdated calling convention for pgdb.connect(). The patch changes how pgdb.connect() is called, avoiding the single parameter that is later parsed; instead, both utils now pass parameters by name.
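The shape of the change can be sketched in Python (a minimal sketch; the helper name dsn_to_kwargs and the exact colon-separated DSN layout are assumptions, not the utilities' actual code):

```python
# Hypothetical illustration of the pgdb.connect() change.
# Old style: pass one colon-separated DSN string that pgdb re-parses.
# New style: pass the same values as named parameters directly.

def dsn_to_kwargs(dsn):
    """Split an old-style 'host:port:database:user' string into the
    keyword arguments that pgdb.connect() accepts directly."""
    host, port, database, user = dsn.split(":")
    return {"host": "%s:%s" % (host, port), "database": database, "user": user}

# old: conn = pgdb.connect("localhost:5432:postgres:gpadmin")
# new: conn = pgdb.connect(**dsn_to_kwargs("localhost:5432:postgres:gpadmin"))
```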
Stolb27 added a commit that referenced this pull request Mar 10, 2026
The following points have been fixed:

1. PyGreSQL 5 has added support for converting additional data types. analyzedb: convert datetime to a string for correct comparison with the value saved in the file. el8_migrate_localte.py, gparray.py, gpcatalog.py, and gpcheckcat: use the bool type instead of comparing with a string. gpcheckcat, repair_missing_extraneous.py, and unique_index_violation_check: use a Python list instead of string parsing.
2. PyGreSQL 5 added support for closing a connection via the with construct. Because of this, in a number of places, reading from the cursor took place after the connection was closed.
3. PyGreSQL 5 does not end the transaction if an error occurs, which can lead to a connection leak if an error occurs in the connect function. So catch errors that happen in the connect function.
4. Add closing of the connection saved in context after the scenario in behave tests.
5. Add closing of the connection if it is not returned from the function.
6. Use the Python wrapper for the connect function instead of the C one.
7. Use a custom cursor to disable row postprocessing, to avoid correcting a large amount of code.
8. Fix the bool and array format in isolation2 tests.
9. Add notification processing to isolation2 tests.
10. Also fix the notification processing in the resgroup_query_mem test.
11. Fix the notification processing in gpload.
12. Fix the pg_config search when building deb packages.
13. Fix gpexpand behave tests (#176). The previous commit introduced a few regressions related to replacing the comparison with 't' with a truth check. That change stems from the fact that PyGreSQL 5, unlike version 4, converts bool values; but it was not taken into account that such values can also be set in Python code. The error of calling verify in TestDML has also been fixed: the verify method was called without passing a connection, and although the verify implementation in the class itself does not require a connection, the function may be overridden in a child class.
14. Fix the PyGreSQL install to be compatible with both Python versions (#183). The PyGreSQL install works in Python 2 but breaks in Python 3, because the _pg extension must be importable as a top-level module (e.g. from _pg import *). Python 3 resolves extension modules via sys.path, so _pg*.so has to be located at the sys.path root, not only inside the pygresql/ package directory. Move _pg*.so from the pygresql directory to the top level, so the same install layout works for both Python versions. Update the _pg*.so RPATH to match its installed location so dpkg-shlibdeps can resolve libpq.so during Debian packaging.
15. Fix Python unit tests after the PyGreSQL update (#222).
    - test_it_retries_the_connection: use a mock object that supports context management.
    - GpArrayTestCase: use the bool type instead of the strings 't'/'f'.
    - GpCheckCatTestCase: check the connection in DbWrapper.
    - DifferentialRecoveryClsTestCase and GpStopSmartModeTestCase: mock GgdbCursor to return a connection.
    - RepairMissingExtraneousTestCase and UniqueIndexViolationCheckTestCase: use Python arrays instead of the string representation of Postgres arrays. Also fix the seg-id set in get_segment_to_oid_mapping; since seg ids in issues are now ints, we no longer need to cast all_seg_ids elements to strings.
16. Move the PyGreSQL code to a submodule (#269). It would be nice to avoid patching this module. This patch also fixes the Greengage installation scripts for PyGreSQL to support non-root release builds over DESTDIR (a problem in Greengage, not in PyGreSQL). Additionally, include the new PyGreSQL license in the NOTICE file.
17. Fix the minirepro and gpsd utilities for PyGreSQL-5.2.5 (#291). Both utils used an outdated calling convention for pgdb.connect(). The patch changes how pgdb.connect() is called, avoiding the single parameter that is later parsed; instead, both utils now pass parameters by name.

Co-authored-by: Denis Garsh <d.garsh@arenadata.io>
Co-authored-by: Vasiliy Ivanov <ivi@arenadata.io>