Add a fallback to PO when a CTE consumer under hazard motion is found #302

Merged
merged 2 commits into from
Mar 28, 2022


@hughcapet hughcapet commented Feb 1, 2022

Orca currently doesn't handle CTEs over replicated tables properly:
while translating Expressions to DXL, it simply cuts off all but one input segment
(changes the Motion subtree's ExecLocalityType from EeltSegments to EeltSingleton)
for a Motion that has a child with Strict/Tainted Replicated/Universal distribution,
without taking into account possible CTE Consumers underneath.
This logic violates the results of the check previously made in
CUtils::ValidateCTEProducerConsumerLocality().
This can result in a hanging query because of the broken Producer-Consumer locality.
See: greenplum-db#13039 and the ADBDEV-2351 ticket.

This PR introduces a temporary fix for this issue by forcing a fallback
to the Postgres Optimizer in the Expr2DXL translation when a CTE Consumer
without the necessary CTE Producer is found under a duplicate-hazard motion.
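
A minimal Python sketch of the kind of check described above, not Orca's actual C++ code. `Node`, `FallbackToPostgresOptimizer`, `collect_ctes` and `check_hazard_motions` are hypothetical names introduced only for illustration. It walks a toy plan tree and raises when the subtree of a hazard motion holds a CTE consumer whose producer is not in the same subtree.

```python
from dataclasses import dataclass, field
from typing import List, Optional, Set


@dataclass
class Node:
    """Hypothetical plan node; a stand-in, not Orca's real expression class."""
    kind: str                      # "motion", "producer", "consumer" or "other"
    cte_id: Optional[int] = None   # set for producers and consumers
    hazard: bool = False           # motion over a replicated/universal child
    children: List["Node"] = field(default_factory=list)


class FallbackToPostgresOptimizer(Exception):
    """Stands in for whatever error triggers the planner fallback."""


def collect_ctes(node: Node, producers: Set[int], consumers: Set[int]) -> None:
    """Collect CTE ids from the whole subtree rooted at node."""
    if node.kind == "producer":
        producers.add(node.cte_id)
    elif node.kind == "consumer":
        consumers.add(node.cte_id)
    for child in node.children:
        collect_ctes(child, producers, consumers)


def check_hazard_motions(node: Node) -> None:
    """Raise if a hazard motion hides a consumer whose producer is elsewhere."""
    if node.kind == "motion" and node.hazard:
        producers: Set[int] = set()
        consumers: Set[int] = set()
        collect_ctes(node, producers, consumers)
        orphans = consumers - producers
        if orphans:
            raise FallbackToPostgresOptimizer(sorted(orphans))
    for child in node.children:
        check_hazard_motions(child)
```

In this toy model, a plan whose producer sits outside the hazard motion while the consumer sits inside it raises the exception. A plan that keeps both under the same motion passes.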

@hughcapet hughcapet force-pushed the ADBDEV-2411-2 branch 3 times, most recently from f3ae5d6 to 8630664 Compare February 11, 2022 07:14
@hughcapet hughcapet marked this pull request as ready for review February 11, 2022 10:16
@hughcapet hughcapet force-pushed the ADBDEV-2411-2 branch 2 times, most recently from be50421 to 4f35f81 Compare February 24, 2022 07:56
@InnerLife0 InnerLife0 self-requested a review February 24, 2022 08:03
InnerLife0 previously approved these changes Feb 24, 2022
InnerLife0 previously approved these changes Feb 28, 2022
This commit introduces a temporary fix for greenplum-db/gpdb#13039 by
forcing a fallback to the Postgres Optimizer in the Expr2DXL
translation when a CTE Consumer without the necessary CTE Producer is
found under a duplicate-hazard motion.

Orca currently doesn't handle CTEs over replicated tables properly:
while translating Expressions to DXL, it simply cuts off all but one
input segment (changes the Motion subtree's ExecLocalityType from
EeltSegments to EeltSingleton) for a Motion that has a child with
Strict/Tainted Replicated/Universal distribution, without taking into
account possible CTE Consumers underneath.
This logic violates the results of the check previously made in
CUtils::ValidateCTEProducerConsumerLocality().
This can result in a hanging query because of the broken
Producer-Consumer locality.
@Stolb27 Stolb27 changed the base branch from adb-6.x to 6.20.1_arenadata33 March 28, 2022 11:02
@Stolb27 Stolb27 merged commit 51fe92e into 6.20.1_arenadata33 Mar 28, 2022
@Stolb27 Stolb27 deleted the ADBDEV-2411-2 branch March 28, 2022 11:20
@Stolb27 Stolb27 mentioned this pull request Jul 1, 2022
Stolb27 added a commit that referenced this pull request Jun 7, 2023
Stolb27 pushed a commit that referenced this pull request Jun 7, 2023
…#302)

Cherry-picked from: 51fe92e
to solve conflicts with 68a9baf and incorporate formatting
modifications from #490
HustonMmmavr added a commit that referenced this pull request Jul 17, 2023
We assume that the plan was validated before the hazard-motion
transformation (such a check exists) and that CTE producers/consumers
are consistent. The following cases may appear after the hazard-motion
transformation:
1. all slices that contain shared scans were transformed
2. the slices that contain CTE producers and some CTE consumers were
   transformed (the plan may contain other slices with consumers of
   these producers)
3. a slice that contains only producer(s) was transformed
4. a slice that contains only consumer(s) was transformed
5. a slice without CTE producers/consumers was transformed

This patch is a follow-up for #302. Detection of unpaired CTEs is now
performed on the current slice (current motion) instead of recursing
into its child slices (motions). Validation of unpaired CTEs was also
changed: the previous approach collected all CTE consumers into an
array and all CTE producers into a set, and at the end probed each
element of the consumer array against the producers hash set (this may
lead to false negatives when the slice contains two different
producers but only a single consumer). Now the sets of CTE consumers
and producers are compared.
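
The difference between the two validations can be sketched in Python. This is an illustration under an assumption, not the patch's C++ code: it assumes the previous check probed each collected consumer against the producer set. The function names `has_unpaired_old` and `has_unpaired_new` are hypothetical.

```python
def has_unpaired_old(producers, consumers):
    """Previous check (as assumed here): probe each consumer in the producer set."""
    producer_set = set(producers)
    return any(c not in producer_set for c in consumers)


def has_unpaired_new(producers, consumers):
    """Current check: the sets of producer and consumer ids must be equal."""
    return set(producers) != set(consumers)


# A slice with two different producers but only a single consumer.
producers = [1, 2]
consumers = [1]

print(has_unpaired_old(producers, consumers))  # False: producer 2 goes unnoticed
print(has_unpaired_new(producers, consumers))  # True: the sets differ
```

The one-sided probe only catches consumers without a producer. Comparing the sets also catches producers without a consumer.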

The current approach correctly validates cases 3, 4, and 5, while
case 1 may give a false-positive result. Case 2 may be dangerous: the
hazard motion (slice) may contain a CTE producer and its consumer while
the plan contains another, unmodified motion (slice) with a CTE
consumer of the producer from the modified slice. This approach won't
recognize that there are unpaired CTE producers and consumers, because
it checks only the consistency of producers/consumers within the
hazard motion (one slice). On the other hand, this situation is
possible if the replicated distribution is requested for the Sequence
node, but due to this patch
(https://github.com/greenplum-db/gpdb/pull/15124) such a situation
shouldn't appear.
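
The blind spot in case 2 can be shown with a toy model in Python. Everything here is hypothetical and for illustration only: the `Slice` class, `slice_is_consistent` and `consumers_left_behind` are not part of the patch, and the plan-wide scan is only a way to make the missed consumer visible, not a proposed fix.

```python
from dataclasses import dataclass, field
from typing import List, Set


@dataclass
class Slice:
    """Hypothetical summary of one slice: the CTE ids it produces and consumes."""
    name: str
    transformed: bool                      # True for the hazard-motion slice
    producers: Set[int] = field(default_factory=set)
    consumers: Set[int] = field(default_factory=set)


def slice_is_consistent(s: Slice) -> bool:
    """Per-slice check: producer and consumer ids inside the slice match."""
    return s.producers == s.consumers


def consumers_left_behind(slices: List[Slice]) -> Set[int]:
    """Consumers in untransformed slices whose producer was transformed."""
    moved_producers: Set[int] = set()
    for s in slices:
        if s.transformed:
            moved_producers |= s.producers
    left: Set[int] = set()
    for s in slices:
        if not s.transformed:
            left |= s.consumers & moved_producers
    return left


hazard = Slice("hazard", transformed=True, producers={1}, consumers={1})
other = Slice("other", transformed=False, consumers={1})

print(slice_is_consistent(hazard))             # True: the per-slice check passes
print(consumers_left_behind([hazard, other]))  # {1}: a consumer in another slice
```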

Follow-up for #302 (51fe92e)
Stolb27 pushed a commit that referenced this pull request Jul 19, 2023
Stolb27 pushed a commit that referenced this pull request Jul 26, 2023
Cherry-picked-from: 2de3b73
to reapply above 781663c
Stolb27 pushed a commit that referenced this pull request Oct 2, 2023
Cherry-picked-from: a5b5c04
to reapply above 9e03478
Stolb27 pushed a commit that referenced this pull request Mar 4, 2024
Cherry-picked-from: a5b5c04
to reapply above 9e03478

(cherry picked from commit 5a10d0a)
Stolb27 pushed a commit that referenced this pull request Mar 14, 2024
Cherry-picked-from: a5b5c04
to reapply above 9e03478

(cherry picked from commit 5a10d0a)
red1452 pushed a commit that referenced this pull request May 28, 2024
Cherry-picked-from: a5b5c04
to reapply above 9e03478

(cherry picked from commit 5a10d0a)
(cherry picked from commit 54ee058)
red1452 pushed a commit that referenced this pull request May 28, 2024
Cherry-picked-from: a5b5c04
to reapply above 9e03478

(cherry picked from commit 5a10d0a)
(cherry picked from commit 54ee058)
red1452 pushed a commit that referenced this pull request May 29, 2024
Cherry-picked-from: a5b5c04
to reapply above 9e03478

(cherry picked from commit 5a10d0a)
(cherry picked from commit 54ee058)
red1452 pushed a commit that referenced this pull request May 30, 2024
Cherry-picked-from: a5b5c04
to reapply above 9e03478

(cherry picked from commit 5a10d0a)
(cherry picked from commit 54ee058)
red1452 pushed a commit that referenced this pull request May 30, 2024
Cherry-picked-from: a5b5c04
to reapply above 9e03478

(cherry picked from commit 5a10d0a)
(cherry picked from commit 54ee058)
red1452 pushed a commit that referenced this pull request May 30, 2024
Cherry-picked-from: a5b5c04
to reapply above 9e03478

(cherry picked from commit 5a10d0a)
(cherry picked from commit 54ee058)
red1452 pushed a commit that referenced this pull request May 31, 2024
Cherry-picked-from: a5b5c04
to reapply above 9e03478

(cherry picked from commit 5a10d0a)
(cherry picked from commit 54ee058)
Stolb27 pushed a commit that referenced this pull request May 31, 2024
Cherry-picked-from: a5b5c04
to reapply above 9e03478

(cherry picked from commit 5a10d0a)
(cherry picked from commit 54ee058)
red1452 pushed a commit that referenced this pull request Jun 3, 2024