ADBDEV-2885-3: Postgres planner produces bogus plan for query to replicated table with SIRV function #378
RekGRpth wants to merge 5 commits into
…sequence generation
The last query from the series
"set optimizer=off;
create table t_replicate_dst(id serial, i integer) distributed replicated;
create table t_replicate_src(i integer) distributed replicated;
insert into t_replicate_src select i from generate_series(1, 5) i;
insert into t_replicate_dst (i) select i from t_replicate_src;
select distinct id from gp_dist_random('t_replicate_dst');"
returned 15 rows (not 5) on a cluster with 3 segments.
The plan of the second insert was:
Insert on t_replicate_dst
  -> Seq Scan on t_replicate_src
The error does not reproduce on master because, when volatile functions are
detected in the query at the stage of adding the Insert node,
a Broadcast Motion node is added. The plan on master is:
Insert on t_replicate_dst
  -> Broadcast Motion 1:3 (slice1; segments: 1)
       -> Seq Scan on t_replicate_src
This patch adds a check for the absence of volatile functions to the condition
under which the Broadcast Motion node is skipped. With that check, the query
plan is the same as on master.
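The row counts above can be illustrated with a small simulation. This is a rough Python sketch, not Greenplum code. It assumes a simplified model in which the `serial` default draws from one cluster-wide sequence and, in the plan with a motion, is evaluated below the Broadcast Motion. The names `Sequence`, `insert_per_segment`, `insert_with_broadcast` and `distinct_ids` are hypothetical and exist only for this illustration.

```python
from itertools import count


class Sequence:
    """Hypothetical stand-in for the cluster-wide sequence behind a serial column."""

    def __init__(self):
        self._counter = count(1)

    def nextval(self):
        return next(self._counter)


def insert_per_segment(src_rows, n_segments, seq):
    """Plan without a motion: every segment scans its own copy of the source
    table and evaluates the volatile default itself."""
    return [[(seq.nextval(), i) for i in src_rows] for _ in range(n_segments)]


def insert_with_broadcast(src_rows, n_segments, seq):
    """Plan with Broadcast Motion 1:N (assumed model): one segment produces the
    rows, evaluating the default once per row, and the result is copied to all."""
    produced = [(seq.nextval(), i) for i in src_rows]
    return [list(produced) for _ in range(n_segments)]


def distinct_ids(segments):
    """Rough analogue of: select distinct id from gp_dist_random(...)."""
    return {row_id for seg in segments for row_id, _ in seg}


src = [1, 2, 3, 4, 5]
bad = insert_per_segment(src, 3, Sequence())
good = insert_with_broadcast(src, 3, Sequence())
print(len(distinct_ids(bad)), len(distinct_ids(good)))  # prints "15 5"
```

In the first case the three copies of the replicated table diverge, which is what the `select distinct id` query exposes. In the second case every segment holds identical rows.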
…ated table with SIRV function
Output: $0
InitPlan 1 (returns $0) (slice2)
  -> Gather Motion 1:1 (slice1; segments: 1)
       Output: i
@RekGRpth, can you explain how this patch makes this plan correct? At first glance, there is no volatile function in the target list.
Since make_subplan does not set CdbLocusType_SingleQE and FLOW_SINGLETON for CdbLocusType_SegmentGeneral, the value of containingPlanDistributed in ParallelizeSubplan becomes false and the plan is focused.
containingPlanDistributed is calculated for the plan that is outer to the subplan. subPlanDistributed describes the distribution of the subplan itself.
The current solution breaks the following queries: Plan without the patch: Plan with the patch: One more similar correlated subquery that does not work properly even without the patch:
Example of a broken query with an UNcorrelated subquery:
As we discussed a few days ago, we should handle all broadcastPlan usages properly.
Why does a query with a subplan containing volatile functions on distributed replicated tables not produce a Gather Motion? Why is it incorrect to replace the SegmentGeneral locus with SingleQE on top of the subplan? Where will the presence of volatile functions be checked after your patch?
done
done
it is covered by the first version of #365
and this is covered by the same #365
yes, but I think it is a different issue from the segfault? And even #355 does not solve this
Apart from that, #365 implements focus_plan and changes only the focusPlan function, yet it somehow affects broadcastPlan (a Broadcast Motion is created). The main problem of the existing implementation is the use of the SingleQE locus, which semantically does not fit the case of a replicated table with a volatile function (there is no data on the QD in this case):
    CdbLocusType_SingleQE,  /* a single backend process on any db: the
                             * qDisp itself, or a qExec started by a
                             * segment postmaster or the entry postmaster.
                             */
And you suggest adding one more kludge to this behavior in #365.
Maybe. BUT the main reason for this problem is that make_subplan alters the locus to SingleQE only for the topmost node in subplans. So, with #365 we will need one more kludge in the same place later.
No, it does not affect broadcast! And it does not break these two cases!
My bad. It still works because the topmost subplan node has the SingleQE locus before broadcastPlan. But the remaining comments still stand.
yes
A query with a subplan containing volatile functions on distributed replicated tables does not produce a Gather Motion. This produces a wrong plan and leads to a segfault.
Steps to reproduce this problem:
Wrong plans:
The fix includes #355 and adds a Gather Motion for such subplans that involve replicated tables with volatile functions.
Right plans: