Sync 6.26.2 changes to dev branch #891

Merged
merged 69 commits into adb-6.x-dev on Mar 15, 2024
Conversation

@Stolb27 Stolb27 (Collaborator) commented Mar 15, 2024

No description provided.

Lisa Owen and others added 30 commits November 16, 2023 13:46
* docs - pgbouncer supports encrypted ldap password

* remove the slash

* add statement about FIPS mode to auth_cipher

* do not need the the
* Added rsync version requirement for gpstate -e

* note formatting

---------

Co-authored-by: Mireia Perez Fuster <mperezfuster@mperezfust5MD6R.vmware.com>
* Docs: added new page to upgrade EL 7 to EL 8

* Modifications

* Modifications

* mods

* mods

* Modifications after review

* Changes from review

* Added note to take backup before in-place upgrade

* Changes from Kris's review

* Update gpdb-doc/markdown/install_guide/migrate-linux.html.md

Co-authored-by: Kris Macoskey <kmacoskey@gmail.com>

* Add note for newly-added info from Kris.

* small formatting fix

---------

Co-authored-by: Mireia Perez Fuster <mperezfuster@mperezfust5MD6R.vmware.com>
Co-authored-by: David Yozie <dyozie@pivotal.io>
Co-authored-by: Kris Macoskey <kmacoskey@gmail.com>
This is a backport of 7X commit 99a48e6, with one notable change:
the GUC gp_gang_creation_retry_non_recovery controls whether we want to
retry on non-recovery failures, giving users a chance to opt out of the
behavior.

Original commit message follows:

Before this commit, any error from a segment that is unrelated to
recovery would result in an immediate termination with the following:

ereport(ERROR, (errcode(ERRCODE_GP_INTERCONNECTION_ERROR),
	errmsg("failed to acquire resources on one or more segments"),
	errdetail("%s (%s)", PQerrorMessage(segdbDesc->conn), segdbDesc->whoami)));

In other words, this termination did not respect
gp_gang_creation_retry_count. From now on, we exhaust the retries first,
before emitting this ERROR.

Co-authored-by: Ashwin Agrawal <aashwin@vmware.com>
Reviewed-by: Xin Zhang <xinzweb@users.noreply.github.com>
Reviewed-by: Huansong Fu <fuhuansong@gmail.com>
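
A minimal sketch of the retry flow described above, with hypothetical helper names (only the two GUC names come from the commit):

```
#include <stdbool.h>

/* Pseudocode sketch; helpers are illustrative, not the GPDB source. */
extern bool try_create_gang(void);
extern bool last_failure_was_recovery(void);
extern int  gp_gang_creation_retry_count;
extern bool gp_gang_creation_retry_non_recovery;

static bool
create_gang_with_retry(void)
{
    for (int attempt = 0; attempt <= gp_gang_creation_retry_count; attempt++)
    {
        if (try_create_gang())
            return true;

        /* Previously, non-recovery failures errored out immediately;
         * now they are retried too, unless the user opts out. */
        if (!last_failure_was_recovery() &&
            !gp_gang_creation_retry_non_recovery)
            break;
    }
    return false;   /* caller raises ERRCODE_GP_INTERCONNECTION_ERROR */
}
```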
Backport from #16675. No merge conflict.

Original commit message:

If there is an explicit BEGIN, and the transaction only has a SET command, we
begin the transaction and set up the DTX context on QEs at the time of the first
SET command. So we need to add dtxSegments in SET to make sure we end the
transaction at the time of END/COMMIT. This is required since b43629b,
where we no longer start the DTX at the time of BEGIN.

However, we should not do this when there is no explicit BEGIN, because in that
case the QEs will not have the DTX context set up by the SET command. If we then
decide to send a DTX protocol command later (e.g. we have some direct-dispatch
command that requires DTX on part of the QEs), we will send the DTX protocol
command to ALL dtxSegments. As a result, some QE might fail to recognize the DTX
protocol command and throw an error. E.g.:

	create table t1(a int, b int);

	CREATE OR REPLACE FUNCTION dtx_bug()
	  RETURNS void
	  LANGUAGE plpgsql
	AS $function$
	BEGIN
	    EXECUTE 'UPDATE t1 set b = 1 where a = 1;';
	    set optimizer=off;
	END;
	$function$;

	postgres=# SELECT dtx_bug();
	ERROR:  Distributed transaction 16607 not found (cdbtm.c:2185)  (seg2 127.0.1.1:7004 pid=32375) (cdbtm.c:2185)

Fix the case by skipping the addition of dtxSegments when there is no explicit
BEGIN/END. Also fix the code comment and add test cases.
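
A minimal sketch of the guard described above; the function, flag, and parameter names are hypothetical:

```
/* Illustrative sketch (not the literal patch): record a QE as a DTX
 * participant for a SET command only under an explicit BEGIN, where
 * the QE already has a DTX context set up. */
static void
maybe_add_dtx_segment(bool command_is_set, bool explicit_begin,
                      List **dtxSegments, int seg_index)
{
    /* Without an explicit BEGIN, the QE never joined the DTX, so a
     * later DTX protocol command sent to it would fail. */
    if (command_is_set && explicit_begin)
        *dtxSegments = list_append_unique_int(*dtxSegments, seg_index);
}
```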
The test was added in greenplum-db/gpdb@b055a88.
Some query CONTEXT output was missing; it is added now.
The fix is to remove the ereport() call in StatementCancelHandler() and the
elog() call in ProcessInterrupts(), avoiding ereport()/elog() in a signal
handler, which may cause deadlock.
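
A minimal sketch of the async-signal-safe pattern this fix moves toward (illustrative names, not the actual backend code):

```
#include <signal.h>

/* A signal handler must not call ereport()/elog(): they may take
 * locks already held at interrupt time, causing deadlock. The safe
 * pattern is to set a flag and report later from normal code. */
static volatile sig_atomic_t cancel_pending = 0;

static void
statement_cancel_handler(int signo)
{
    cancel_pending = 1;     /* async-signal-safe: just set the flag */
}

/* Checked later at a safe point (the CHECK_FOR_INTERRUPTS() idea): */
static void
process_interrupts(void)
{
    if (cancel_pending)
    {
        cancel_pending = 0;
        /* safe to call ereport(ERROR, ...) here, outside the handler */
    }
}
```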
Files for skipping dumping symbols and types may not exist. We should
allow the absence of these files.

Also, the length of the symbol `ConfigureNamesBool_gp` changed in
12423cd, and it's a safe ABI-breaking change. This patch adds it to
the `.abi-check/6.26.0/postgres.symbols.ignore` list.
When a user selects a large number of rows via psql, they may encounter the following error:
ERROR: "Cannot add cell to table content: total cell count of XXX exceeded".
The root cause is an int overflow in printTableAddCell(); this commit fixes it.

It has already been fixed in 7X and upstream PostgreSQL:
- greenplum-db/gpdb#16388
- postgres/postgres@24ea53d
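
A small self-contained illustration of this overflow class and the widened arithmetic the referenced fixes use (the values are made up):

```
#include <limits.h>
#include <stdio.h>

/* The total cell count is rows * columns; for large result sets the
 * product exceeds INT_MAX. Widening the arithmetic avoids the
 * overflow, as in the fixes referenced above. */
int
main(void)
{
    int       nrows = 100000000;    /* 100 million rows */
    int       ncols = 30;
    long long total = (long long) nrows * ncols;   /* widened: safe */

    if (total > INT_MAX)
        printf("cell count %lld would overflow a 32-bit int\n", total);
    return 0;
}
```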
This commit fixes the issue where `gpstop` errors out when there is an empty PID file in the `gpstop.lock` directory.
```
20231013:03:18:29:033818 gpstart:mdw:gpadmin-[CRITICAL]:-Error occurred: non-zero rc: 1
 Command was: '$GPHOME/bin/gpstop -a -m -M immediate -d /data1/master/gpseg-1 -v -B 64'
rc=1, stdout='', stderr='Traceback (most recent call last):
  File "/usr/local/greenplum-db-6.25.3/bin/gpstop", line 1098, in <module>
    simple_main(GpStop.createParser, GpStop.createProgram, GpStop.mainOptions())
  File "/usr/local/greenplum-db-6.25.3/lib/python/gppylib/mainUtils.py", line 263, in simple_main
    simple_main_internal(createOptionParserFn, createCommandFn, mainOptions)
  File "/usr/local/greenplum-db-6.25.3/lib/python/gppylib/mainUtils.py", line 288, in simple_main_internal
    otherpid = sml.acquire()
  File "/usr/local/greenplum-db-6.25.3/lib/python/gppylib/mainUtils.py", line 165, in acquire
    self.pidfilepid = self.pidlockfile.read_pid()
  File "/usr/local/greenplum-db-6.25.3/lib/python/gppylib/mainUtils.py", line 102, in read_pid
    owner = int(p.read())
ValueError: invalid literal for int() with base 10: ''
Exception ValueError: ValueError("invalid literal for int() with base 10: ''",) in <bound method PIDLockFile.__del__ of <gppylib.mainUtils.PIDLockFile instance at 0x7f11fed9b0a0>> ignored
```

Due to the empty file, whenever gpstop reads it and tries to convert the contents to an integer, we get a `ValueError` exception. With this change, we now automatically delete the lock directory in such situations so that `gpstop` can go ahead and create a new lock directory.

Another issue was that even though we delete the lock directory at the end of process execution, this does not guarantee that it is completely deleted at the disk level. This issue came up when `gpsnap` created a snapshot after running `gpstop`: when the data was restored using `gpsnap restore`, the `gpstop` lock directory reappeared with an empty PID file, indicating that the directory had not been completely removed.

So, to guarantee that the changes are written to disk, a `sync` system call has been added after the lock directory is removed. Refer [here](https://stackoverflow.com/questions/7127075/what-exactly-is-file-flush-doing) for more info on how Python handles file operations.
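
A minimal C sketch of the two ideas in the fix (the real code lives in the Python gppylib; the names and paths here are illustrative):

```
#include <stdio.h>
#include <stdlib.h>
#include <unistd.h>

/* Treat an empty or non-numeric PID file as a stale lock. */
static int
read_pid(const char *path)
{
    FILE *f = fopen(path, "r");
    int   pid = -1;

    if (f != NULL)
    {
        if (fscanf(f, "%d", &pid) != 1)
            pid = -1;           /* empty file: stale lock */
        fclose(f);
    }
    return pid;
}

int
main(void)
{
    if (read_pid("gpstop.lock/PID") < 0)
        system("rm -rf gpstop.lock");   /* stale: remove, then recreate */

    sync();     /* flush the removal to disk before any snapshot */
    return 0;
}
```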
because these platforms support system libzstd 1.4.4

[GPR-1624]

Authored-by: Anusha Shakarad <shakarada@vmware.com>
The current check order when reserving memory for a resource group is: 1. reserve
memory from the resource group slot; 2. if the slot quota is exceeded,
reserve memory from the group shared memory; 3. then reserve memory from the
global free chunks.

If group 2 overuses the global freeChunks, then even if group 1 has enough
group-shared free space to allocate from, it will still check the global
freeChunks value and get a MemoryFailure_ResourceGroupMemoryExhausted error.

Check global freeChunks only when the group shared memory is overused.
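
A pseudocode sketch of the corrected reservation order; all helper names are hypothetical:

```
#include <stdbool.h>

/* Illustrative helpers, not the GPDB resource-group source. */
extern bool reserve_from_slot(int chunks);
extern bool reserve_from_group_shared(int chunks);
extern bool reserve_from_global_freechunks(int chunks);

static bool
reserve_chunks(int chunks)
{
    if (reserve_from_slot(chunks))              /* 1. slot quota */
        return true;
    if (reserve_from_group_shared(chunks))      /* 2. group shared */
        return true;
    /* 3. Only when group shared memory is also exhausted do we touch
     * the global freeChunks; checking it earlier let one group's
     * overuse fail another group that still had shared headroom. */
    return reserve_from_global_freechunks(chunks);
}
```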
…6799)

Currently, a query falls back to the planner when a subquery is present in
a ScalarMinMax (LEAST or GREATEST) expression. In this PR, I am
addressing the situation by incorporating the ScalarMinMax operator when
processing scalar operators with subqueries.

Support subquery within ScalarArrayRef/ScalarArrayListIndex
…(#16800)

ORCA generated an incorrect plan when doing a "Left Semi Join with replicated outer table":
duplicated rows were observed in the output. In the plan, the combination of
'ctid, gp_segment_id' was used as a 'key' for the replicated outer table.

Since the same data is present across segments for a replicated table, this key
does not help in uniquely identifying any tuple, so this combination should not
be added as a key. The addition is prevented by not including it while retrieving
key sets during 'Relcache to DXL' translation.

Co-authored-by: nishant sharma <nishants4@nishants46M6RY.vmware.com>
In a followup commit ExecProcNode(), and especially the large switch
it contains, will largely be replaced by a function pointer directly
to the correct node. The node functions will then get invoked by a
thin inline function wrapper. To avoid having to include miscadmin.h
in headers - CHECK_FOR_INTERRUPTS() - move the interrupt checks into
the individual executor routines.

While looking through all executor nodes, I noticed a number of
arguably missing interrupt checks, add these too.

Author: Andres Freund, Tom Lane
Reviewed-By: Tom Lane
Discussion: https://postgr.es/m/22833.1490390175@sss.pgh.pa.us
Backported-by: Hovhannes Tsakanyan <hovhannes.ht@gmail.com>
(back ported from commit d47cfef)
**Issue:**
gpcheckcat reports a `column "oid" does not exist` error while running the 'missing_extraneous' test on the pg_shdescription catalog table.
This is because internally the 'missing_extraneous' test of gpcheckcat tries to fetch the oid of all the catalog tables present in the database.
For the pg_shdescription table, the oid was changed to objoid in 6X and 7X,
so there is no attribute named oid in the pg_shdescription table.
A similar issue will be seen for other catalog tables wherever the 'attname' field of the pg_attribute table is objoid.
These tables are as below:
- pg_seclabels
- pg_description
- pg_shdescription
- pg_seclabel
- pg_shseclabel

The issue with the pg_description table was fixed as part of [#16140](greenplum-db/gpdb#16140).
This PR tracks the fix for the other catalog tables listed above.

**Fix:**
Catalog tables which do not have an OID column are stored in a dictionary, 'TableMainColumn', and are mapped to an existing OID-type attribute. An entry for the required catalog tables is added to the dictionary so that objoid is used instead of oid when running SQL queries.

**Test:**
A behave test is added for pg_shdescription table.
Previous coding in get_const_expr() handled negative numbers, but an OpExpr
applied to a number is an exception (e.g. -1 could be a negative Const, but
could also be the OpExpr '-' applied to the number 1). So we need a workaround
in get_rule_sortgroupclause() for the OpExpr case: pre-calculate the OpExpr
over the number to a real Const using expression_planner().

authored-by: chaotian <chaotian@vmware.com>
The upstream of pgbouncer has fixed a lot of crash issues, so it is better to upgrade pgbouncer to the latest version before customers report crash issues in the future.
Also, the latest branch of pgbouncer has set the submodule libusual to the upstream repo: https://github.com/libusual/libusual.
Another submodule, uthash, was added to pgbouncer: https://github.com/troydhanson/uthash.git.
GPDB does not support passing params through a Motion.
If we pass params through a Motion, it will cause a panic.
```
create table ttt(tc1 varchar(10)) distributed randomly;
create table ttt1(tc2 varchar(10)) distributed randomly;
insert into ttt values('sdfs');
insert into ttt1 values('sdfs');
  
explain (costs off)
select 
   ttt.*,  t.t1 
from 
  ttt 
  left join lateral (
    select 
      string_agg(distinct tc2, ';') as t1 
    from 
      ttt1 
    where 
      ttt.tc1=ttt1.tc2
 ) t on true;
  
 --------------------------------------------------------------------------------------
 Nested Loop Left Join  (cost=10000000001.06..10000000002.12 rows=4 width=37)
   ->  Gather Motion 3:1  (slice1; segments: 3)  (cost=0.00..1.03 rows=1 width=5)
         ->  Seq Scan on ttt  (cost=0.00..1.01 rows=1 width=5)
   ->  Materialize  (cost=1.06..1.08 rows=1 width=32)
         ->  Aggregate  (cost=1.06..1.07 rows=1 width=32)
               ->  Gather Motion 3:1  (slice2; segments: 3)  (cost=0.00..1.05 rows=1 width=5)
                     ->  Seq Scan on ttt1  (cost=0.00..1.01 rows=1 width=5)
                           Filter: ((ttt.tc1)::text = (tc2)::text)
 Optimizer: Postgres query optimizer
```

Add a walker in create_nestloop_plan to check whether the extParam of a Motion
in the rightPlan contains any NestLoopParam.
In that case, we throw an error instead of panicking.
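
A sketch of the walker's shape; the recursion helper and error wording are illustrative, not the exact source:

```
/* Illustrative sketch: while building a nested loop, walk the inner
 * (right) plan; if a Motion's extParam overlaps the params the nested
 * loop would supply, raise an error instead of panicking at runtime. */
static bool
motion_param_walker(Plan *plan, Bitmapset *nl_params)
{
    if (plan == NULL)
        return false;

    if (IsA(plan, Motion) && bms_overlap(plan->extParam, nl_params))
        ereport(ERROR,
                (errcode(ERRCODE_FEATURE_NOT_SUPPORTED),
                 errmsg("correlated parameters cannot be passed through a Motion")));

    /* recurse into child plans via the plan-tree walker machinery;
     * walk_plan_children is a hypothetical stand-in */
    return walk_plan_children(plan, motion_param_walker, nl_params);
}
```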
…16797) (#16851)

This fix addresses a memory leak that occurs during the execution of DynamicIndexScan
and DynamicBitmapIndexScan. Each partition of the DynamicIndexScan creates an IndexScanState
for execution. Within each IndexScanState, an ExprContext is allocated but not freed
in ExecFreeExprContext(). This ExprContext includes a MemoryContext, resulting in a memory
leak of the initial context space during each iteration of the DynamicIndexScan partition loop,
which leads to a significant accumulation of leaked memory if the partition loop is extensive.
The same issue is observed in the execution of DynamicBitmapIndexScan.

This fix ensures that the allocated MemoryContext within the ExprContext is properly freed during
the execution of each partition in Dynamic Index/BitmapIndex Scan.

Authored-by: wuyuhao28 <wuyuhao28@github.com>
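
A minimal sketch of the cleanup this fix ensures, using the standard FreeExprContext() teardown path (illustrative, not the literal patch):

```
/* Free each partition's ExprContext (and its MemoryContext)
 * explicitly when tearing down the per-partition IndexScanState;
 * otherwise the initial context space leaks once per partition. */
static void
end_partition_index_scan(IndexScanState *scanState)
{
    if (scanState->ss.ps.ps_ExprContext != NULL)
    {
        FreeExprContext(scanState->ss.ps.ps_ExprContext, true);
        scanState->ss.ps.ps_ExprContext = NULL;
    }
}
```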
The new column shows how many backends have a buffer pinned. That can
be useful during development or to diagnose production issues
e.g. caused by vacuum waiting for cleanup locks.

To handle upgrades transparently - the extension might be used in
views - deal with callers expecting the old number of columns.

Reviewed by Fujii Masao and Rajeev rastogi.

(cherry picked from commit f577919)
That makes the view a lot less disruptive to use on a production system.
Without the locks, you don't get a consistent snapshot across all buffers,
but that's OK. It wasn't a very useful guarantee in practice.

Ivan Kartyshov, reviewed by Tomas Vondra and Robert Haas.

Discussion: <f9d6cab2-73a7-7a84-55a8-07dcb8516ae5@postgrespro.ru>
(cherry picked from commit 6e65454)
We can't check the output of this view very closely without
creating portability headaches, but we can make sure that
the number of rows is as-expected.  In any case, this is
sufficient to exercise all the C code within, which is a
lot better than the 0% coverage we had before.

DongWook Lee

Discussion: https://postgr.es/m/CAAcByaLCHGJB7qAENEcx9D09UL=w4ma+yijwF_-1MSqQZ9wK6Q@mail.gmail.com
(cherry picked from commit be39d88)
(cherry picked from commit a987b76)
This commit builds and packages pg_buffercache 1.1 while
adding the MPP view gp_buffercache to the extension.

Note that 1.1 is the maximum compatible version for GP6 due to
significant changes made to buffer descriptors and buffer locking
semantics introduced upstream in 9.5/9.6.

Ref:
postgres/postgres@ed12700
postgres/postgres@4835458

Also adds basic tests for the views.
`AclObjectKind` has no enum value for views; just change the output file.
Missed a resource queue notice.
Fix issue after greenplum-db/gpdb#16840
Remove the following code:
if (!is_plan_node(node))
    return false;
There is a situation where the node is not a plan node but its children may be
plan nodes; we should continue to walk its children recursively.
```
Nested Loop
   ->  Gather Motion 1:1  (slice1; segments: 1)
         ->  Seq Scan on t1
   ->  Materialize
         ->  Unique
               Group Key: t2.a, t2.b
               ->  Sort
                     Sort Key (Distinct): t2.a, t2.b
                     ->  Append
                           ->  Gather Motion 1:1  (slice2; segments: 1)
                                 ->  Seq Scan on t2
                                       Filter: ((t1.b).x = (b).y)
                           ->  Gather Motion 1:1  (slice3; segments: 1)
                                 ->  Seq Scan on t1 t1_1
                                       Filter: ((a + 1) < 10)
```
As seen in the plan above, the Gather Motions are in the appendplans of the
Append node, and appendplans is of type List*.
With the code above, is_plan_node(List) returns false, so we return false,
which prevents the Seq Scan on t2 from being processed recursively.
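
A sketch of the resulting walker shape; the recursion helper name is illustrative:

```
/* The node handed to the walker may be a List (e.g. the List* of
 * Append->appendplans) rather than a Plan node, so bailing out on
 * !is_plan_node(node) would hide every Motion stored inside it. */
static bool
param_walker(Node *node, void *context)
{
    if (node == NULL)
        return false;

    if (is_plan_node(node))
    {
        /* ... inspect Motion / Scan nodes here ... */
    }

    /* Always recurse: a non-Plan node (such as a List) may still
     * contain Plan nodes among its children. The helper name is a
     * hypothetical stand-in for the walker machinery. */
    return plan_or_expression_tree_walker(node, param_walker, context);
}
```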
Stolb27 and others added 21 commits March 14, 2024 09:27
…he queries_now_fast external table (#636)"

to solve conflicts with 08d747e

This reverts commit 742bf6f.
to solve conflicts with 08d747e

This reverts commit fc2aade.
apply 9290a5c to our custom power answer file

git show 9290a5c -- src/test/regress/expected/subselect_gp.out >
    /tmp/subselect.patch
patch -p1 --merge src/test/regress/expected/subselect_gp_1.out <
    /tmp/subselect.patch

Co-authored-by: Hari krishna <49222145+pobbatihari@users.noreply.github.com>
Greenplum Python scripts try to parse system commands' STDOUT and STDERR
in English and may fail if the locale is different from en_US. This patch
adds tests that cover this case.

Also, this patch reinstalls glibc-common in docker container. This is
necessary to get langpacks in docker because docker images don't contain
them.

Cherry-picked-from: 6298e77
(cherry picked from commit fc2aade)
…es_now_fast external table (#636)

gpperfmon writes a string to queries_now.dat in the expectation that it will be
processed by gpmon_catqrynow.py, but the queries_now_fast external table reads
from the file directly. This leads to an error, because queries_now.dat contains
the query_hash column that is not declared in the queries_now_fast table. Also
queries_now.dat contains empty columns at the end that should be filled by
the python script. These columns are not declared in the table either.

This patch fixes this issue by removing extra columns from queries_now.dat and
adding them directly in the python script and in the write_qlog_full (write into
queries_tail.dat) function.

Co-authored-by: Andrey Sokolov <a.sokolov@arenadata.io>
(cherry picked from commit 742bf6f)
It replaces the hard-coded define MAX_SCAN_ON_SHMEM and
specifies the limit on the number of scan-node instrumentations per query in shmem.
If a table has many partitions, the Postgres planner will generate a plan with many SCAN
nodes under an APPEND node. If there are too many partitions, this plan will
occupy too many slots. This adds a limit on the number of shmem slots used by scan
nodes for each backend. Instrumentations exceeding the limit are allocated in local memory.

(cherry picked from commit 6425610)
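
A minimal sketch of how such a limit is typically applied; the GUC and helper names here are hypothetical:

```
/* Scan nodes beyond the per-backend shmem slot limit fall back to
 * backend-local memory for their Instrumentation. */
static Instrumentation *
alloc_scan_instrument(int used_shmem_slots)
{
    if (used_shmem_slots < gp_max_scan_on_shmem)    /* the new GUC */
        return instrument_slot_from_shmem();        /* hypothetical */

    /* over the limit: allocate locally instead of in shmem */
    return (Instrumentation *) palloc0(sizeof(Instrumentation));
}
```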
We assume that the plan before the hazard-motion transformation was validated
(such a check exists) and that CTE producers/consumers are consistent. The
following cases may appear after the hazard-motion transformation:
1. all slices containing shared scans were transformed
2. slices containing CTE producers and some CTE consumers were transformed
   (the plan may contain another slice with consumers of these producers)
3. a slice containing only producer(s) was transformed
4. a slice containing only consumer(s) was transformed
5. a slice without CTE producers/consumers was transformed

This patch is a follow-up for #302. Detection of unpaired CTEs is now
performed on the current slice (current motion) instead of recursing into its
child slices (motions). The validation of unpaired CTEs was also changed: the
previous approach collected all CTE consumers into an array and all CTE
producers into a set, and at the end each element of the consumer array was
probed against the producer hash set (this may lead to false-negative cases
when the slice contains two different producers but only a single consumer).
Now the sets of CTE consumers and producers are compared.

The current approach correctly validates cases 3, 4 and 5, while case 1 may
give a false-positive result. Case 2 may be dangerous: there is a possible
situation where the hazard motion (slice) contains a CTE producer and its
consumer, but the plan contains another unmodified motion (slice) with a CTE
consumer of the producer from the modified slice. This approach won't
recognize that there are unpaired CTE producers and consumers, because it
checks only the consistency of producers/consumers within the hazard motion
(one slice). On the other hand, this situation is possible only if replicated
distribution is requested for a Sequence node, and due to this patch
(https://github.com/greenplum-db/gpdb/pull/15124) such a situation shouldn't
appear.

Follow-up for #302 (51fe92e)

Cherry-picked-from: a5b5c04
to reapply above 9e03478

(cherry picked from commit 5a10d0a)
The call of the cdbpullup_findEclassInTargetList function takes place when
one needs to check whether the given EquivalenceClass contains exprs that
can be expressed in terms of the given targetlist. This commit changes the way
how the equivalence class is iterated through in order to find an eclass member,
which is already inside the targetList or uses no Vars that are not in
targetList. Previous code returned the first member it found suitable: if the
current member is const or is equal to targetList entry, it will be returned
immediately, otherwise we check whether the member has non-Var type and is not
inside targetList, and return the member or continue the iteration process. That
logic is optimal in most cases, however, there are the cases when that logic
brings erroneous behaviour. For example, in queries containing parameterized
subqueries (like the one presented in the tests) it may lead to passing params
across motion nodes, which may create the situation when these params are being
referenced in different slices, and it probably could lead to segfault when the
executor is accessing its attribute list.

In the query shown in the test section, such a situation occurs when the Flow node's
hashExprs list is filled during join inner plan creation for a scan node. Because
subquery has WHERE clause, the EquivalenceClass for subquery's relation
distribution key contains two members: one is of type T_Param and other
corresponds to the distribution key. When the hashExprs list is being filled, the
cdbpullup_findEclassInTargetList function is called, and for the given eclass
and targetList it returns the Param member. This Param is returned because
EquivalenceClass member holding it has em_is_const flag on, for situation like
the one in the tests the flag is set anyway, because WHERE operand refers to
outer relations. The function under discussion returns the Param because it
came across the appropriate member without checking targetList. Eventually
this Param entry is appended to hashExprs and later to the Seq Scan's targetList
from hashExprs, which leads to an invalid nested loop plan and a possible
segfault during execution.

The proposed solution makes cdbpullup_findEclassInTargetList delay the return
of consts and params in order to find a member that is equal to a targetList
entry. This approach solves the issue with param passing, because for such
situations it is guaranteed that plan's targetList contains the distribution
key, and if we iterate through eclass and targetList first, the member
corresponding to distribution key will be returned from the function instead
of the Param. This approach does not break the function's original purpose,
because the entries of the eclass may come in arbitrary order.

Cherry-picked-from: db2d388
to solve conflicts with: 31db276

(cherry picked from commit 93d4623)
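
A minimal sketch of the reordered lookup described above (illustrative, not the literal patch):

```
/* First look for an eclass member that matches the targetlist; only
 * fall back to a const/Param member when no such match exists. */
static Expr *
find_eclass_expr(EquivalenceClass *eclass, List *targetlist)
{
    Expr     *fallback = NULL;
    ListCell *lc;

    foreach(lc, eclass->ec_members)
    {
        EquivalenceMember *em = (EquivalenceMember *) lfirst(lc);

        if (tlist_member((Node *) em->em_expr, targetlist))
            return em->em_expr;         /* preferred: in the targetlist */

        if (fallback == NULL && em->em_is_const)
            fallback = em->em_expr;     /* delayed: const or Param */
    }
    return fallback;
}
```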
Flow structure has a segidColIdx field, which referencing a gp_segment_id
column in plan's targetList when the Explicit Redistribute Motion is requested.
There was a problem: the _copyFlow, _equalFlow, _outFlow and _readFlow
functions did not handle the segidColIdx field. Therefore, the Flow node
could not be serialized, deserialized, or copied correctly.

The problem manifested itself when a query had UPDATE/DELETE operation, that
required Explicit Redistribute, inside the SubPlan (or InitPlan). The Explicit
Redistribute did not applied correctly inside the apply_motion_mutator function
because by that moment the value segidColIdx had been lost. The problem occured
because previously SubPlans had been mutated inside the ParallelizeSubplan
function, where the SubPlan's plan had been copied via copyObject function.
This function copies the whole plan including the Flow node, which is copied
via _copyFlow function. However the Flow node copying did not include the
copying of segidColIdx Flow field, which is used for valid performance of
Explicit Redistribute Motion.

Therefore, this patch solves the issue by adding segidColIdx to the list of
fields to copy, serialize, deserialize and compare in the _copyFlow, _outFlow,
_readFlow and _equalFlow functions respectively.

Cherry-picked from: 4a9aac4

(cherry picked from commit 55e1511)
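
A sketch of the fix's shape in copyfuncs conventions (the real patch touches _copyFlow, _outFlow, _readFlow and _equalFlow; the other fields are elided here):

```
static Flow *
_copyFlow(const Flow *from)
{
    Flow *newnode = makeNode(Flow);

    COPY_SCALAR_FIELD(flotype);
    /* ... other Flow fields ... */
    COPY_SCALAR_FIELD(segidColIdx);     /* previously missing: Explicit
                                         * Redistribute lost its column */

    return newnode;
}
```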
The executor enabled the EXEC_FLAG_REWIND flag for all types of SubPlans: either
for InitPlans or correlated/uncorrelated SubPlans. This flag represents that the
rescan is expected and is used during initialization of executor state. However,
if a query had InitPlans, which contained non-rescannable nodes (like Split
Update in the tests), the executor failed with assertion error (for example,
when calling ExecInitSplitUpdate when initializing executor state inside the
ExecInitNode for the InitPlan).

Because InitPlans are essentially executed only once, there is no need to expect
a rescan of the InitPlan. Therefore, in order to support non-rescannable
operations inside the InitPlans, this patch disables EXEC_FLAG_REWIND flag for
InitPlans.

This patch partially returns vanilla postgres logic, which used
plannedstmt->rewindPlanIDs bitmapset for making a decision whether current
SubPlan should be executed with EXEC_REWIND flag. This bitmapset used to be
filled with the ids of such SubPlans, that could optimize the rescan
operation if the EXEC_REWIND is set, like parameterless subplans. Other
types of SubPlans were considered rescannable by default and there was
no need to set the EXEC_REWIND flag for them. However, GPDB interprets the
EXEC_REWIND flag as an indicator that the node is likely to be rescanned, and
also used this flag to delay the eager free. Therefore, this patch proposes
to fill plannedstmt->rewindPlanIDs set with all the subplans ids, except
InitPlans, and to set EXEC_REWIND flag only for those subplans that are
in the rewindPlanIDs bitmapset.

As for legacy optimizer, the if-clause influencing the filling of the
bitmapset is changed inside the build_subplan function in order to filter out
any InitPlans.

As for ORCA optimizer, rewindPlanIDs was not previously used, and this patch
adds a bunch of logic to fill this bitmapset with subplan ids. This
patch extends existing SetInitPlanVariables function and renames it to
SetSubPlanVariables. This function has originally been setting
nInitPlans and nParamExec in PlannedStmt, and also has been setting
qDispSliceId for each InitPlan, that is found during plan tree
traversal. This patch extends this behaviour and additionally fills the
rewindPlanIDs bitmapset for each SubPlan found, except InitPlans.

At executor side, the condition checking whether the SubPlan is
in the planned_stmt->rewindPlanIDs is added to the InitPlan function.
From that point, SubPlans will be initialized with EXEC_REWIND flag
only if they are not InitPlans.

Ticket: ADBDEV-4059

Cherry-picked from: d0a5bc0

(cherry picked from commit ebd310c)
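
A minimal sketch of the executor-side decision, close to the vanilla postgres shape the patch partially restores (variable names are illustrative):

```
/* A subplan gets EXEC_FLAG_REWIND only if it was recorded in
 * rewindPlanIDs, which now excludes InitPlans, so non-rescannable
 * nodes inside InitPlans are initialized without expecting a rescan. */
int sp_eflags = eflags & EXEC_FLAG_EXPLAIN_ONLY;

if (bms_is_member(subplan_id, plannedstmt->rewindPlanIDs))
    sp_eflags |= EXEC_FLAG_REWIND;  /* rescan likely: delay eager free */

subplanstate = ExecInitNode(subplan, estate, sp_eflags);
```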
When a query had a modifying command inside the correlated SubPlan, the
ModifyTable node could be rescanned for each outer tuple. That led to
execution errors (rescan of specific nodes is not supported). This happened
because the ParallelizeCorrelatedSubplanMutator function did not expect the
ModifyTable node inside the correlated SubPlans.

This patch adds the support of the ModifyTable nodes for correlated SubPlans.
Currently, the ModifyTable node can get into the SubPlan only as part of a CTE
query; therefore, it can either be wrapped in the SubqueryScan node or be
standalone, depending on the SubqueryScan being trivial or not. This
patch affects ParallelizeCorrelatedSubplanMutator function. The patch extends
the if-clause dedicated to the choice of plan nodes, that need to be
broadcasted or focused, and then materialized. The specific conditions related
to modifying operations were added. These conditions check whether the current node
is the SubqueryScan with ModifyTable just under it or current node is a
standalone ModifyTable. If condition is satisfied, node is then processed the
same way as any other Scan-type nodes. Next, the result of ModifyTable is either
broadcasted or focused depending on the target flow type. Then the result is
materialized in order to avoid rescan of underlying nodes.

Cherry-picked-from: c164546
(cherry picked from commit 611ff8a)
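
A minimal sketch of the extended condition (illustrative, not the literal patch):

```
/* A correlated SubPlan node qualifies for broadcast/focus plus
 * materialize when it is a standalone ModifyTable or a SubqueryScan
 * directly over one. */
bool is_modifying =
    IsA(plan, ModifyTable) ||
    (IsA(plan, SubqueryScan) &&
     IsA(((SubqueryScan *) plan)->subplan, ModifyTable));
```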
The 6.26.3 sync contains changes (0c80dcf) that break ABI. Later, in
df46c0b, an ignore file was added for version 6.26.1. But this tag is absent
in our fork, and moreover, we should validate ABI changes against our
latest tag, 6.26.0_arenadata54; otherwise we would get all changes
between the upstream tag and this PR, which includes our changes. That is why
I've created a copy of the ignore file for our tag.
The 6.26.2 sync contains changes (b72ff6b) that break ABI. Later, in
af90388, an ignore file was added for version 6.26.1. But this tag is absent
in our fork, and moreover, we should validate ABI changes against our
latest tag, 6.26.0_arenadata54; otherwise we would get all changes
between the upstream tag and this PR, which includes our changes. That is why
I've duplicated these changes to the ignore file for our tag.
@Stolb27 Stolb27 marked this pull request as ready for review March 15, 2024 09:32
@Stolb27 Stolb27 requested a review from a team March 15, 2024 09:32
@BenderArenadata

Allure report https://allure.adsw.io/launch/66749

@BenderArenadata

Failed job Resource group isolation tests on x86_64: https://gitlab.adsw.io/arenadata/github_mirroring/gpdb/-/jobs/1167915

@BenderArenadata

Failed job Resource group isolation tests on ppc64le: https://gitlab.adsw.io/arenadata/github_mirroring/gpdb/-/jobs/1167916

@BenderArenadata

Failed job Regression tests with ORCA on ppc64le: https://gitlab.adsw.io/arenadata/github_mirroring/gpdb/-/jobs/1167910

@Stolb27 Stolb27 enabled auto-merge March 15, 2024 13:16
@Stolb27 Stolb27 merged commit cdc9332 into adb-6.x-dev Mar 15, 2024
5 checks passed