Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Show planner error for SELECT INTO with DML in CTE #964

Merged
merged 9 commits into from
Aug 8, 2024
Merged

Conversation

whitehawk
Copy link

@whitehawk whitehawk commented May 31, 2024

Show planner error for SELECT INTO with DML in CTE

Problem:
Greenplum fails to execute SELECT INTO and CREATE TABLE AS statements, whose
queries contain modifying CTEs, because Greenplum cannot have two writer
segworker groups for one query. In current implementation all the plans with
modifying CTE contain motion nodes, which separate slice with DML operation
from the slice, where the new table is created. And during execution stage an
error is thrown, telling that node with DML operation cannot be executed,
because one writer segworker already exists. However, showing the error during
planning stage would be more effective, therefore this commit proposes the
display of the error during planning.

This commit counts the number of writing slices during the plan tree traversal.
If there are more than one writing gang, it throws an error. In case writing
slice is found in the InitPlan, it is counted separately, as InitPlans are
executed independently.

Tests are partially taken from commit: cc2e30f

Co-authored-by: Alexander Kondakov a.kondakov@arenadata.io

@whitehawk whitehawk marked this pull request as ready for review June 3, 2024 05:17
@RekGRpth

This comment was marked as resolved.

RekGRpth
RekGRpth previously approved these changes Jun 3, 2024
@RekGRpth

This comment was marked as resolved.

Copy link

@KnightMurloc KnightMurloc left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

It looks like we can move the check to the analysis stage in the transformCreateTableAsStmt function. It may also make sense to make the error message more understandable to the user. Something like: "CREATE TABLE AS must not use data-modifying statements in WITH"

src/test/regress/sql/with_clause.sql Outdated Show resolved Hide resolved
@RekGRpth

This comment was marked as resolved.

@bimboterminator1
Copy link
Member

bimboterminator1 commented Jun 7, 2024

@whitehawk , @KnightMurloc
We must admit that checking hasModifyingCTE value is not enough to cover all the cases. For example, the CTE may not be refernced at all:

explain (costs off)
with cte as
(insert into with_dml select i, i * 100 from generate_series(1, 5) i returning *)
select into t_new from t1;
ERROR:  cannot create plan with several writing gang

Probably merge 944 before and use the same walker here. It means that porting original patch now doesn't make sence and all kinds of refractoring are appropriate.

I suggest considering complex check like its done in #944 .

@KnightMurloc
Copy link

Isn't it a bug that if we don't have references to a modifying CTE, then we don't perform such a cte at all?
Also, if the DML is in the Init Plan, then we can execute it together with CTAS.

create table t2 as with cte1 as (
    insert into t values(1) returning *
),
cte2 as (
    select * from unnest(
        array(
            select * from cte1
        )
    )
) select * from cte2;
                                        QUERY PLAN                                        
------------------------------------------------------------------------------------------
 Redistribute Motion 1:3  (slice1; segments: 1)  (cost=0.07..0.31 rows=3 width=4)
   Hash Key: unnest.unnest
   ->  Function Scan on unnest  (cost=0.07..0.17 rows=10 width=4)
         InitPlan 1 (returns $1)  (slice2)
           ->  Gather Motion 1:1  (slice3; segments: 1)  (cost=0.00..0.07 rows=3 width=4)
                 ->  Insert on t  (cost=0.00..0.03 rows=1 width=4)
                       ->  Result  (cost=0.00..0.01 rows=1 width=4)
 Optimizer: Postgres-based planner
(8 rows)

@bimboterminator1
Copy link
Member

bimboterminator1 commented Jun 7, 2024

Isn't it a bug that if we don't have references to a modifying CTE, then we don't perform such a cte at all?

If it's not refernced, it is not inlined and not executed. GP differs from PG in this situation cause GP does not have CteScan node

Also, if the DML is in the Init Plan, then we can execute it together with CTAS.

There is another case when cte has locus Replicated and there is also possibility to execute it with CTAS. However, if the main approach is throwing an error, making exclusions in this logic will look weird. I doubt that user will need this type of query at all. All such cases should be resolved by allowing executing several writing gangs, not here.

@Stolb27 Stolb27 marked this pull request as draft June 10, 2024 06:18
@whitehawk whitehawk changed the base branch from adb-7.1.0 to adb-7.2.0 July 26, 2024 02:34
@whitehawk whitehawk marked this pull request as ready for review July 29, 2024 04:23
@RekGRpth

This comment was marked as resolved.

@whitehawk
Copy link
Author

The current solution in 7X differs from the similar solution in 6X (in which modifying CTEs are prohibited when creating a table even if they are not used).

Yes, it is done so according to comment #964 (comment) .

Copy link

@KnightMurloc KnightMurloc left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The current solution also throw error on the plans in which DML is executed in InitPlan, which, as I understood from the discussion, we want to keep.

src/backend/cdb/cdbllize.c Outdated Show resolved Hide resolved
src/backend/cdb/cdbllize.c Outdated Show resolved Hide resolved
@Stolb27 Stolb27 enabled auto-merge (squash) August 7, 2024 06:35
@Stolb27 Stolb27 merged commit afac0f9 into adb-7.2.0 Aug 8, 2024
5 checks passed
@Stolb27 Stolb27 deleted the ADBDEV-5523 branch August 8, 2024 10:28
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

None yet

5 participants