prep_relation_expr earlier in peek sequencing#33533
prep_relation_expr earlier in peek sequencing#33533ggevay merged 1 commit intoMaterializeInc:mainfrom
prep_relation_expr earlier in peek sequencing#33533Conversation
678721a to
e7da299
Compare
prep_relation_expr earlier in peek sequencing
prep_relation_expr earlier in peek sequencingprep_relation_expr earlier in peek sequencing
|
Thanks for asking, but I'll try and stay away from optimizer business 😅 |
|
Ok, then tagging @mgree instead for the overall review 😅 |
|
I have included the change in #33520, no further occurrences! |
|
Persist test changes seem fine to me - thanks for the explanation! |
mgree
left a comment
There was a problem hiding this comment.
Just one question about timestamps, otherwise LGTM!
| .expect("unique as_of element"); | ||
|
|
||
| // Resolve all unmaterializable function calls including mz_now(). | ||
| let style = ExprPrepStyle::OneShot { |
There was a problem hiding this comment.
Could it be the case that prep_relation_expr or prep_scalar_expr could care about until? If so, we should move the following block before this.
There was a problem hiding this comment.
Fortunately, that can't happen, because we don't pass the DataflowDescription to prep_relation_expr or prep_scalar_expr, so they have no way of observing what until we have set on the DataflowDescription.
| |r| prep_relation_expr(r, style), | ||
| |s| prep_scalar_expr(s, style), | ||
| )?; | ||
|
|
There was a problem hiding this comment.
Same concern about until here.
|
|
||
| query T multiline | ||
| EXPLAIN OPTIMIZED PLAN WITH (humanized expressions) AS VERBOSE TEXT FOR SELECT * from numbers WHERE value > mz_now() LIMIT 10; | ||
| EXPLAIN OPTIMIZED PLAN WITH (humanized expressions) AS VERBOSE TEXT FOR SELECT * from numbers WHERE value > mz_now()::text::int8 / 100000000000000 LIMIT 10; |
There was a problem hiding this comment.
Took me a second to notice the case here inducing the type change below. Looks good.
397f828 to
7d10f2b
Compare
|
Thanks for the review! |
Prepping in two phases made sense before MaterializeInc#33533 (where phase 1 is with `EvalTime::Deferred`) because meaningful things were happening between the two phases, which were benefitting from phase 1 doing at least partial prepping. But after phase 2 was moved earlier in that PR, now the two phases are very close to each other, and the code between them is not sensitive to phase 1's existence. So, this commit just removes phase 1, and then also removes the now-unused `EvalTime::Deferred`.
Prepping in two phases made sense before #33533 (where phase 1 is with `EvalTime::Deferred`), because meaningful things were happening between the two phases, which were benefitting from phase 1 doing at least partial prepping. But after phase 2 was moved earlier in that PR, now the two phases are very close to each other, and the code between them is not sensitive to phase 1's existence. So, this commit just removes phase 1, and then also removes the now-unused `EvalTime::Deferred`.
This is fixing https://github.com/MaterializeInc/database-issues/issues/9645 by moving unmaterializable function evaluation (
prep_relation_expr/prep_scalar_expr) earlier in peek sequencing. It also does the same move in COPY TO sequencing. Even though fast path is not a thing there, so the above issue can't occur, we'd still like to avoid unnecessary divergence between the peek sequencing and the COPY TO sequencing codes.Also note that even independently from https://github.com/MaterializeInc/database-issues/issues/9645, it seems generally better to do
mz_now()(and other unmaterializable func) evaluation before optimization, because constants are easier to work with during optimization than function calls. This is demonstrated by a few (somewhat contrived) tests flipping from slow path to fast path peeks.The cause of https://github.com/MaterializeInc/database-issues/issues/9645 was a somewhat weird interaction between various things:
create_fast_path_planis able to handle an MFP on top of a Get, but is not able to handle an MFP on top of a Constant. This limitation is understandable, give that we usually expect FoldConstants to make an MFP on top of a Constant disappear.After the PR moved unmaterializable function evaluation before (global or fast path) optimization, the fast path optimizer now doesn't leave the MFP on top of the Constant, because the MFP no longer has an unmaterializable function call at that point, so FoldConstants folds away the MFP. (The MFP's result is folded into a new Constant node.)
(Philosophically, the above 1. is a non-monotonicity in the optimizer, which is not good: Generally, it's good if optimization code has the behavior that if
plan'is a better plan thanplan, thenoptimize(plan')is a better plan thanoptimize(plan). As per 1.,create_fast_path_planis an optimization that doesn't have this property: an MFP on top of a constant is a better plan than an MFP on top of a Get, butcreate_fast_path_planmakes the former into a worse plan than the latter. Such non-monotonicities in the optimizer often lead to unpleasant surprises: if optimizationO2has a non-monotonicity, and we have an optimizer pipelineO1 -> O2, and we make a change toO1, then even if that change is a strict improvement locally forO1, the change can still regress theO1 -> O2pipeline. So, it would be good to eliminate the non-monotonicity here by makingcreate_fast_path_planable to handle an MFP on top of a Constant, but I consider this out of scope for this PR. I opened a separate issue: https://github.com/MaterializeInc/database-issues/issues/9665)This PR has a slightly annoying side-effect on slt testing: if an slt has an EXPLAIN OPTIMIZED PLAN on a peek that involves mz_now, the explain output will now show the evaluation result of the mz_now. This is a problem in tests, because mz_now changes at every slt run. I worked around this problem in two ways:
mz_nowis not evaluated in materialized view plans. (filter-pushdown.slt)mz_nowby 100000000000000 to make it always 0. I used this workaround instead of 1. inpersist-fast-path.slt, because it was essential here for the test to remain a peek, since persist fast path is not relevant for materialized views.@bkirwi may I ask you to check the test changes in
filter-pushdown.sltandpersist-fast-path.slt(explained in the previous paragraph) that I didn't accidentally invalidate any tests there?@aljoscha, I tagged you for the overall PR review, because it's a peek sequencing thing. But if you'd like to avoid delving into optimization stuff at this moment, then let me know, and then I'll tag Michael instead.
Nightly: https://buildkite.com/materialize/nightly/builds/13207
New Nightly after fixing the problem in the 4-replica slt: https://buildkite.com/materialize/nightly/builds/13209
Motivation
Tips for reviewer
Checklist
$T ⇔ Proto$Tmapping (possibly in a backwards-incompatible way), then it is tagged with aT-protolabel.