prep_relation_expr earlier in peek sequencing#33533

Merged
ggevay merged 1 commit intoMaterializeInc:mainfrom
ggevay:prep-earlier
Sep 10, 2025

@ggevay ggevay commented Sep 6, 2025

This fixes https://github.com/MaterializeInc/database-issues/issues/9645 by moving unmaterializable function evaluation (prep_relation_expr/prep_scalar_expr) earlier in peek sequencing. It also makes the same move in COPY TO sequencing. There is no fast path there, so the above issue can't occur, but we'd still like to avoid unnecessary divergence between the peek sequencing code and the COPY TO sequencing code.

Also note that, even independently of https://github.com/MaterializeInc/database-issues/issues/9645, it seems generally better to evaluate mz_now() (and other unmaterializable functions) before optimization, because constants are easier to work with during optimization than function calls. This is demonstrated by a few (somewhat contrived) tests flipping from slow path to fast path peeks.

The cause of https://github.com/MaterializeInc/database-issues/issues/9645 was a somewhat weird interaction between various things:

  1. create_fast_path_plan is able to handle an MFP on top of a Get, but is not able to handle an MFP on top of a Constant. This limitation is understandable, given that we usually expect FoldConstants to make an MFP on top of a Constant disappear.
  2. In the above issue, an MFP on top of a Constant was not disappearing, because it was containing an unmaterializable function call, which FoldConstants can't handle.
  3. We have a special fast path optimizer, which we use during peek sequencing when a peek looks like a fast path peek between local and global optimization, so that we don't have to run the full global optimization pipeline. The soft assert that was tripped in the above issue is meant to verify that if a peek looked like a fast path peek before running the fast path optimizer, then we can still plan it as a fast path peek after running the fast path optimizer. This was violated here because the plan was an MFP on top of a Get before running the fast path optimizer, but an MFP on top of a Constant afterwards.

After the PR moved unmaterializable function evaluation before (global or fast path) optimization, the fast path optimizer now doesn't leave the MFP on top of the Constant, because the MFP no longer has an unmaterializable function call at that point, so FoldConstants folds away the MFP. (The MFP's result is folded into a new Constant node.)
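To make the interaction concrete, here is a minimal, self-contained Rust sketch. It is a toy model and not Materialize code: `Expr`, `Plan`, `eval`, `prep_plan`, and this `fold_constants` are hypothetical stand-ins for the real expression types, prep_relation_expr, and FoldConstants, and the "MFP" is reduced to a single filter that keeps rows greater than a threshold.

```rust
/// Toy scalar expression. `MzNow` stands in for any unmaterializable call.
#[derive(Clone, Debug, PartialEq)]
enum Expr {
    Literal(i64),
    MzNow,
    Add(Box<Expr>, Box<Expr>),
}

/// Toy relation expression. `Filter` plays the role of an MFP that keeps
/// only the rows that are greater than `threshold`.
#[derive(Clone, Debug, PartialEq)]
enum Plan {
    Constant(Vec<i64>),
    Filter { threshold: Expr, input: Box<Plan> },
}

/// Evaluates `expr` at optimization time, or returns `None` if it still
/// contains an unmaterializable call.
fn eval(expr: &Expr) -> Option<i64> {
    match expr {
        Expr::Literal(v) => Some(*v),
        Expr::MzNow => None,
        Expr::Add(a, b) => Some(eval(a)? + eval(b)?),
    }
}

/// Replaces every `MzNow` in a scalar expression with the chosen timestamp.
fn prep(expr: &Expr, now: i64) -> Expr {
    match expr {
        Expr::Literal(v) => Expr::Literal(*v),
        Expr::MzNow => Expr::Literal(now),
        Expr::Add(a, b) => Expr::Add(Box::new(prep(a, now)), Box::new(prep(b, now))),
    }
}

/// Applies `prep` to every scalar expression in the plan.
fn prep_plan(plan: Plan, now: i64) -> Plan {
    match plan {
        Plan::Filter { threshold, input } => Plan::Filter {
            threshold: prep(&threshold, now),
            input: Box::new(prep_plan(*input, now)),
        },
        other => other,
    }
}

/// Folds a filter over a constant into a new constant, but only when the
/// threshold can be evaluated. Otherwise the filter is left in place.
fn fold_constants(plan: Plan) -> Plan {
    match plan {
        Plan::Filter { threshold, input } => {
            let input = fold_constants(*input);
            if let (Plan::Constant(rows), Some(t)) = (&input, eval(&threshold)) {
                return Plan::Constant(rows.iter().copied().filter(|r| *r > t).collect());
            }
            Plan::Filter { threshold, input: Box::new(input) }
        }
        other => other,
    }
}
```

In this model, folding a filter with an `MzNow` threshold over a constant returns the plan unchanged, which corresponds to the MFP on top of a Constant that tripped the soft assert. Running `prep_plan` first lets the same fold collapse the plan into a single `Constant`.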

(Philosophically, the above 1. is a non-monotonicity in the optimizer, which is not good: Generally, it's good if optimization code has the behavior that if plan' is a better plan than plan, then optimize(plan') is a better plan than optimize(plan). As per 1., create_fast_path_plan is an optimization that doesn't have this property: an MFP on top of a constant is a better plan than an MFP on top of a Get, but create_fast_path_plan makes the former into a worse plan than the latter. Such non-monotonicities in the optimizer often lead to unpleasant surprises: if optimization O2 has a non-monotonicity, and we have an optimizer pipeline O1 -> O2, and we make a change to O1, then even if that change is a strict improvement locally for O1, the change can still regress the O1 -> O2 pipeline. So, it would be good to eliminate the non-monotonicity here by making create_fast_path_plan able to handle an MFP on top of a Constant, but I consider this out of scope for this PR. I opened a separate issue: https://github.com/MaterializeInc/database-issues/issues/9665)
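The regression scenario described above can be sketched with a deliberately tiny Rust model. Everything here is hypothetical and only for illustration: `o1_before` and `o1_after` are two versions of an upstream optimization O1, and `o2` mimics the limitation from 1. by planning an MFP on top of a Get as fast path while falling back to slow path for an MFP on top of a Constant.

```rust
/// The two plan shapes discussed above.
#[derive(Clone, Copy, Debug, PartialEq)]
enum Plan {
    MfpOnGet,
    MfpOnConstant,
}

/// Outcome of the downstream planning step.
#[derive(Clone, Copy, Debug, PartialEq)]
enum Peek {
    FastPath,
    SlowPath,
}

/// O1 before the change: leaves the plan alone.
fn o1_before(plan: Plan) -> Plan {
    plan
}

/// O1 after a locally strict improvement: it manages to replace the Get
/// with a Constant, which in isolation is a better plan.
fn o1_after(_plan: Plan) -> Plan {
    Plan::MfpOnConstant
}

/// O2 with the non-monotonicity: the better input gets the worse output.
fn o2(plan: Plan) -> Peek {
    match plan {
        Plan::MfpOnGet => Peek::FastPath,
        Plan::MfpOnConstant => Peek::SlowPath,
    }
}
```

With these definitions, `o2(o1_before(Plan::MfpOnGet))` is `Peek::FastPath`, while `o2(o1_after(Plan::MfpOnGet))` is `Peek::SlowPath`: the improvement to O1 regresses the O1 -> O2 pipeline as a whole.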

This PR has a slightly annoying side-effect on slt testing: if an slt has an EXPLAIN OPTIMIZED PLAN on a peek that involves mz_now, the explain output will now show the evaluation result of the mz_now. This is a problem in tests, because mz_now changes at every slt run. I worked around this problem in two ways:

  1. Made some tests create a materialized view instead of a peek. mz_now is not evaluated in materialized view plans. (filter-pushdown.slt)
  2. Divided mz_now by 100000000000000 so that the result is always 0. I used this workaround instead of 1. in persist-fast-path.slt, because there it was essential for the test to remain a peek, since persist fast path is not relevant for materialized views.
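As a quick check of the arithmetic behind workaround 2., here is a small Rust sketch. It assumes that mz_now() yields milliseconds since the Unix epoch and that the division is an integer division that truncates, as with int8. `DIVISOR` and `stabilized` are names made up for this illustration.

```rust
/// The divisor used in the test workaround: 10^14.
const DIVISOR: i64 = 100_000_000_000_000;

/// Integer-divides a millisecond timestamp by `DIVISOR`.
/// The result is 0 for every timestamp in `0..DIVISOR`.
fn stabilized(now_ms: i64) -> i64 {
    now_ms / DIVISOR
}
```

A timestamp from September 2025 is about 1.76 * 10^12 ms, two orders of magnitude below the divisor, so `stabilized` returns 0 for it. Under the same assumption, 10^14 ms is roughly 3,170 years, so the explain output would stay stable for any realistic test run.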

@bkirwi may I ask you to check the test changes in filter-pushdown.slt and persist-fast-path.slt (explained in the previous paragraph), to make sure that I didn't accidentally invalidate any tests there?

@aljoscha, I tagged you for the overall PR review, because it's a peek sequencing thing. But if you'd like to avoid delving into optimization stuff at this moment, then let me know, and then I'll tag Michael instead.

Nightly: https://buildkite.com/materialize/nightly/builds/13207
New Nightly after fixing the problem in the 4-replica slt: https://buildkite.com/materialize/nightly/builds/13209

Checklist

  • This PR has adequate test coverage / QA involvement has been duly considered. (trigger-ci for additional test/nightly runs)
  • This PR has an associated up-to-date design doc, is a design doc (template), or is sufficiently small to not require a design.
  • If this PR evolves an existing $T ⇔ Proto$T mapping (possibly in a backwards-incompatible way), then it is tagged with a T-proto label.
  • If this PR will require changes to cloud orchestration or tests, there is a companion cloud PR to account for those changes that is tagged with the release-blocker label (example).
  • If this PR includes major user-facing behavior changes, I have pinged the relevant PM to schedule a changelog post.

@ggevay ggevay force-pushed the prep-earlier branch 4 times, most recently from 678721a to e7da299 Compare September 7, 2025 18:20
@ggevay ggevay added A-ADAPTER Topics related to the ADAPTER layer A-optimization Area: query optimization and transformation labels Sep 7, 2025
@ggevay ggevay requested review from aljoscha and bkirwi September 7, 2025 18:26
@ggevay ggevay marked this pull request as ready for review September 7, 2025 18:27
@ggevay ggevay requested a review from a team as a code owner September 7, 2025 18:27
@ggevay ggevay changed the title adapter: prep_relation_expr earlier in peek sequencing adapter: prep_relation_expr earlier in peek sequencing Sep 7, 2025
@ggevay ggevay changed the title adapter: prep_relation_expr earlier in peek sequencing prep_relation_expr earlier in peek sequencing Sep 7, 2025
aljoscha commented Sep 8, 2025

Thanks for asking, but I'll try and stay away from optimizer business 😅

ggevay commented Sep 8, 2025

Ok, then tagging @mgree instead for the overall review 😅

def- commented Sep 8, 2025

I have included the change in #33520, no further occurrences!

bkirwi commented Sep 8, 2025

Persist test changes seem fine to me - thanks for the explanation!

@mgree mgree left a comment

Just one question about timestamps, otherwise LGTM!

.expect("unique as_of element");

// Resolve all unmaterializable function calls including mz_now().
let style = ExprPrepStyle::OneShot {

Contributor

Could prep_relation_expr or prep_scalar_expr care about until? If so, we should move the following block before this.

Contributor Author

Fortunately, that can't happen, because we don't pass the DataflowDescription to prep_relation_expr or prep_scalar_expr, so they have no way of observing what until we have set on the DataflowDescription.

|r| prep_relation_expr(r, style),
|s| prep_scalar_expr(s, style),
)?;

Contributor

Same concern about until here.


query T multiline
- EXPLAIN OPTIMIZED PLAN WITH (humanized expressions) AS VERBOSE TEXT FOR SELECT * from numbers WHERE value > mz_now() LIMIT 10;
+ EXPLAIN OPTIMIZED PLAN WITH (humanized expressions) AS VERBOSE TEXT FOR SELECT * from numbers WHERE value > mz_now()::text::int8 / 100000000000000 LIMIT 10;

Contributor

Took me a second to notice the cast here inducing the type change below. Looks good.

ggevay commented Sep 10, 2025

Thanks for the review!

@ggevay ggevay enabled auto-merge September 10, 2025 16:05
@ggevay ggevay merged commit aeeb685 into MaterializeInc:main Sep 10, 2025
128 checks passed
ggevay added a commit to ggevay/materialize that referenced this pull request Mar 29, 2026
Prepping in two phases made sense before
MaterializeInc#33533
(where phase 1 is with `EvalTime::Deferred`)
because meaningful things were happening between the two phases,
which were benefitting from phase 1 doing at least partial
prepping. But after phase 2 was moved earlier in that PR, now the
two phases are very close to each other, and the code between them
is not sensitive to phase 1's existence. So, this commit just
removes phase 1, and then also removes the now-unused
`EvalTime::Deferred`.
ggevay added a commit that referenced this pull request Apr 1, 2026
Prepping in two phases made sense before
#33533 (where phase 1
is with `EvalTime::Deferred`), because meaningful things were happening
between the two phases, which were benefitting from phase 1 doing at
least partial prepping. But after phase 2 was moved earlier in that PR,
now the two phases are very close to each other, and the code between
them is not sensitive to phase 1's existence. So, this commit just
removes phase 1, and then also removes the now-unused
`EvalTime::Deferred`.
5 participants