`prep_relation_expr` earlier in peek sequencing by ggevay · Pull Request #33533 · MaterializeInc/materialize

ggevay · 2025-09-06T18:42:48Z

This fixes https://github.com/MaterializeInc/database-issues/issues/9645 by moving unmaterializable function evaluation (prep_relation_expr/prep_scalar_expr) earlier in peek sequencing. It also makes the same move in COPY TO sequencing. Fast path is not a thing there, so the above issue can't occur, but we'd still like to avoid unnecessary divergence between the peek sequencing and COPY TO sequencing code.

Also note that even independently from https://github.com/MaterializeInc/database-issues/issues/9645, it seems generally better to do mz_now() (and other unmaterializable func) evaluation before optimization, because constants are easier to work with during optimization than function calls. This is demonstrated by a few (somewhat contrived) tests flipping from slow path to fast path peeks.

The cause of https://github.com/MaterializeInc/database-issues/issues/9645 was a somewhat weird interaction between various things:

1. create_fast_path_plan is able to handle an MFP on top of a Get, but is not able to handle an MFP on top of a Constant. This limitation is understandable, given that we usually expect FoldConstants to make an MFP on top of a Constant disappear.
2. In the above issue, an MFP on top of a Constant was not disappearing, because it contained an unmaterializable function call, which FoldConstants can't handle.
3. We have a special fast path optimizer, which we use during peek sequencing when a peek looks like a fast path peek between local and global optimization, so that we don't have to run the full global optimization pipeline. The soft assert that was tripped in the above issue is meant to verify that if a peek looked like a fast path peek before running the fast path optimizer, then we can still plan the peek as fast path after running the fast path optimizer. This got violated here because the plan was an MFP on top of a Get before running the fast path optimizer, but was an MFP on top of a Constant after running it.

After the PR moved unmaterializable function evaluation before (global or fast path) optimization, the fast path optimizer now doesn't leave the MFP on top of the Constant, because the MFP no longer has an unmaterializable function call at that point, so FoldConstants folds away the MFP. (The MFP's result is folded into a new Constant node.)
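To make the interaction concrete, here is a toy model in Rust. It is only a sketch: `Scalar`, `Plan`, `prep`, `inline_get`, `fold_constants`, and `is_fast_path` are hypothetical stand-ins, far simpler than the real MIR types and transforms, and `inline_get` merely stands in for whatever rewrite turned the Get into a Constant in the real issue.

```rust
// Toy model only: hypothetical types, not Materialize's real MIR.
#[derive(Debug, Clone, PartialEq)]
enum Scalar {
    Literal(i64),
    Column, // the single column of the input row
    MzNow,  // stands in for any unmaterializable function call
    Gt(Box<Scalar>, Box<Scalar>),
}

#[derive(Debug, Clone, PartialEq)]
enum Plan {
    Get(&'static str),
    Constant(Vec<i64>),
    // Stand-in for an MFP: keeps the rows for which the predicate is true.
    Filter { input: Box<Plan>, predicate: Scalar },
}

/// Replace MzNow with a literal, mimicking prepping for a one-shot query.
fn prep(e: Scalar, now: i64) -> Scalar {
    match e {
        Scalar::MzNow => Scalar::Literal(now),
        Scalar::Gt(a, b) => Scalar::Gt(Box::new(prep(*a, now)), Box::new(prep(*b, now))),
        other => other,
    }
}

/// Evaluate on one row; None if an unmaterializable call is still present.
fn eval(e: &Scalar, row: i64) -> Option<i64> {
    match e {
        Scalar::Literal(v) => Some(*v),
        Scalar::Column => Some(row),
        Scalar::MzNow => None,
        Scalar::Gt(a, b) => Some((eval(a, row)? > eval(b, row)?) as i64),
    }
}

/// Hypothetical rewrite that turns the Get under a Filter into a Constant.
fn inline_get(plan: Plan) -> Plan {
    match plan {
        Plan::Filter { input, predicate } => match *input {
            Plan::Get(_) => Plan::Filter { input: Box::new(Plan::Constant(vec![1, 5, 9])), predicate },
            other => Plan::Filter { input: Box::new(other), predicate },
        },
        other => other,
    }
}

/// Fold a Filter over a Constant, but only if every row can be evaluated.
fn fold_constants(plan: Plan) -> Plan {
    match plan {
        Plan::Filter { input, predicate } => match *input {
            Plan::Constant(rows) => {
                let kept: Option<Vec<(i64, i64)>> =
                    rows.iter().map(|&r| eval(&predicate, r).map(|v| (r, v))).collect();
                match kept {
                    Some(k) => Plan::Constant(k.into_iter().filter(|p| p.1 != 0).map(|p| p.0).collect()),
                    None => Plan::Filter { input: Box::new(Plan::Constant(rows)), predicate },
                }
            }
            other => Plan::Filter { input: Box::new(other), predicate },
        },
        other => other,
    }
}

/// Toy fast path check: accepts a Filter over a Get, not over a Constant.
fn is_fast_path(plan: &Plan) -> bool {
    match plan {
        Plan::Get(_) | Plan::Constant(_) => true,
        Plan::Filter { input, .. } => matches!(**input, Plan::Get(_)),
    }
}

fn optimize(plan: Plan) -> Plan {
    fold_constants(inline_get(plan))
}

fn peek(predicate: Scalar) -> Plan {
    Plan::Filter { input: Box::new(Plan::Get("t")), predicate }
}

fn main() {
    let pred = Scalar::Gt(Box::new(Scalar::Column), Box::new(Scalar::MzNow));
    // Prepping after optimization: the Filter over the Constant survives.
    let late = optimize(peek(pred.clone()));
    println!("late prep:  {:?}, fast path: {}", late, is_fast_path(&late));
    // Prepping before optimization: the Filter folds into a Constant.
    let early = optimize(peek(prep(pred, 4)));
    println!("early prep: {:?}, fast path: {}", early, is_fast_path(&early));
}
```

In this toy, the unprepped peek looks like a fast path peek before `optimize` and stops looking like one afterwards, which is the shape of the violated soft assert. With `prep` applied first, the same pipeline ends in a plain Constant.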

(Philosophically, the above 1. is a non-monotonicity in the optimizer, which is not good: Generally, it's good if optimization code has the behavior that if plan' is a better plan than plan, then optimize(plan') is a better plan than optimize(plan). As per 1., create_fast_path_plan is an optimization that doesn't have this property: an MFP on top of a constant is a better plan than an MFP on top of a Get, but create_fast_path_plan makes the former into a worse plan than the latter. Such non-monotonicities in the optimizer often lead to unpleasant surprises: if optimization O2 has a non-monotonicity, and we have an optimizer pipeline O1 -> O2, and we make a change to O1, then even if that change is a strict improvement locally for O1, the change can still regress the O1 -> O2 pipeline. So, it would be good to eliminate the non-monotonicity here by making create_fast_path_plan able to handle an MFP on top of a Constant, but I consider this out of scope for this PR. I opened a separate issue: https://github.com/MaterializeInc/database-issues/issues/9665)
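The non-monotonicity argument can be sketched with a small Rust model. Everything here is hypothetical and only encodes the ordering claims from the paragraph above: `Shape` ranks an MFP on a Constant as better than an MFP on a Get, `o2` plays the role of create_fast_path_plan, and `o1_old`/`o1_new` are two versions of an upstream optimization where the new one is a strict local improvement.

```rust
// Variants are declared from worse to better, so the derived Ord matches "is a better plan".
#[derive(Debug, Clone, Copy, PartialEq, Eq, PartialOrd, Ord)]
enum Shape {
    MfpOnGet,
    MfpOnConstant,
}

#[derive(Debug, Clone, Copy, PartialEq, Eq, PartialOrd, Ord)]
enum Outcome {
    SlowPath,
    FastPath,
}

/// Toy stand-in for create_fast_path_plan: it handles an MFP on a Get only.
fn o2(shape: Shape) -> Outcome {
    match shape {
        Shape::MfpOnGet => Outcome::FastPath,
        Shape::MfpOnConstant => Outcome::SlowPath,
    }
}

/// Monotone means: whenever a <= b, also f(a) <= f(b).
fn is_monotone(f: impl Fn(Shape) -> Outcome) -> bool {
    let all = [Shape::MfpOnGet, Shape::MfpOnConstant];
    all.iter().all(|&a| all.iter().all(|&b| a > b || f(a) <= f(b)))
}

// Two hypothetical versions of the upstream optimization O1.
fn o1_old() -> Shape {
    Shape::MfpOnGet
}

fn o1_new() -> Shape {
    Shape::MfpOnConstant
}

fn pipeline(o1: fn() -> Shape) -> Outcome {
    o2(o1())
}

fn main() {
    println!("o2 monotone: {}", is_monotone(o2));
    println!("O1 improved locally: {}", o1_new() > o1_old());
    println!("old pipeline: {:?}", pipeline(o1_old));
    println!("new pipeline: {:?}", pipeline(o1_new));
}
```

In this model the improvement to O1 makes the O1 -> O2 pipeline end on the slow path, which is the regression pattern described above.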

This PR has a slightly annoying side-effect on slt testing: if an slt has an EXPLAIN OPTIMIZED PLAN on a peek that involves mz_now, the explain output will now show the evaluation result of the mz_now. This is a problem in tests, because mz_now changes at every slt run. I worked around this problem in two ways:

1. Made some tests create a materialized view instead of a peek. mz_now is not evaluated in materialized view plans. (filter-pushdown.slt)
2. Divided mz_now by 100000000000000, so that the result is always 0. I used this workaround instead of 1. in persist-fast-path.slt, because it was essential there for the test to remain a peek, since persist fast path is not relevant for materialized views.
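The arithmetic behind the division workaround can be checked with a short Rust sketch. It assumes that mz_now() yields milliseconds since the Unix epoch and that the division is integer division on int8, as the cast in the test query suggests; `DIVISOR` and `normalized` are hypothetical names used only for illustration.

```rust
use std::time::{SystemTime, UNIX_EPOCH};

// The divisor used in the test, 10^14.
const DIVISOR: i64 = 100_000_000_000_000;

/// Integer division, mirroring `mz_now()::text::int8 / 100000000000000`.
fn normalized(now_ms: i64) -> i64 {
    now_ms / DIVISOR
}

fn main() {
    // Assumption: the timestamp is milliseconds since the Unix epoch.
    let now_ms = SystemTime::now()
        .duration_since(UNIX_EPOCH)
        .expect("system clock is after the Unix epoch")
        .as_millis() as i64;
    println!("now = {} ms, normalized = {}", now_ms, normalized(now_ms));
}
```

Under these assumptions the quotient stays 0 for any timestamp below 10^14 ms, which is roughly three thousand years away, so the EXPLAIN output no longer changes between slt runs.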

@bkirwi may I ask you to check the test changes in filter-pushdown.slt and persist-fast-path.slt (explained in the previous paragraph), to confirm that I didn't accidentally invalidate any tests there?

@aljoscha, I tagged you for the overall PR review, because it's a peek sequencing thing. But if you'd like to avoid delving into optimization stuff at this moment, then let me know, and then I'll tag Michael instead.

Nightly: https://buildkite.com/materialize/nightly/builds/13207
New Nightly after fixing the problem in the 4-replica slt: https://buildkite.com/materialize/nightly/builds/13209

Motivation

This PR fixes a recognized bug: https://github.com/MaterializeInc/database-issues/issues/9645

Tips for reviewer

Checklist

This PR has adequate test coverage / QA involvement has been duly considered. (trigger-ci for additional test/nightly runs)
This PR has an associated up-to-date design doc, is a design doc (template), or is sufficiently small to not require a design.
If this PR evolves an existing $T ⇔ Proto$T mapping (possibly in a backwards-incompatible way), then it is tagged with a T-proto label.
If this PR will require changes to cloud orchestration or tests, there is a companion cloud PR to account for those changes that is tagged with the release-blocker label (example).
If this PR includes major user-facing behavior changes, I have pinged the relevant PM to schedule a changelog post.

aljoscha · 2025-09-08T08:22:45Z

Thanks for asking, but I'll try and stay away from optimizer business 😅

ggevay · 2025-09-08T08:49:01Z

Ok, then tagging @mgree instead for the overall review 😅

def- · 2025-09-08T19:00:45Z

I have included the change in #33520, no further occurrences!

bkirwi · 2025-09-08T19:08:07Z

Persist test changes seem fine to me - thanks for the explanation!

mgree

Just one question about timestamps, otherwise LGTM!

mgree · 2025-09-10T15:49:16Z

src/adapter/src/optimize/copy_to.rs

+            .expect("unique as_of element");
+
+        // Resolve all unmaterializable function calls including mz_now().
+        let style = ExprPrepStyle::OneShot {


Could it be the case that prep_relation_expr or prep_scalar_expr could care about until? If so, we should move the following block before this.

Fortunately, that can't happen, because we don't pass the DataflowDescription to prep_relation_expr or prep_scalar_expr, so they have no way of observing what until we have set on the DataflowDescription.

mgree · 2025-09-10T15:49:53Z

src/adapter/src/optimize/peek.rs

+            |r| prep_relation_expr(r, style),
+            |s| prep_scalar_expr(s, style),
+        )?;
+


Same concern about until here.

mgree · 2025-09-10T15:52:52Z

test/sqllogictest/persist-fast-path.slt


 query T multiline
-EXPLAIN OPTIMIZED PLAN WITH (humanized expressions) AS VERBOSE TEXT FOR SELECT * from numbers WHERE value > mz_now() LIMIT 10;
+EXPLAIN OPTIMIZED PLAN WITH (humanized expressions) AS VERBOSE TEXT FOR SELECT * from numbers WHERE value > mz_now()::text::int8 / 100000000000000 LIMIT 10;


Took me a second to notice the cast here inducing the type change below. Looks good.

Fixes MaterializeInc/database-issues#9645

ggevay · 2025-09-10T16:05:53Z

Thanks for the review!

Prepping in two phases made sense before MaterializeInc#33533 (where phase 1 is with `EvalTime::Deferred`) because meaningful things were happening between the two phases, which were benefitting from phase 1 doing at least partial prepping. But after phase 2 was moved earlier in that PR, now the two phases are very close to each other, and the code between them is not sensitive to phase 1's existence. So, this commit just removes phase 1, and then also removes the now-unused `EvalTime::Deferred`.

ggevay force-pushed the prep-earlier branch 4 times, most recently from 678721a to e7da299 Compare September 7, 2025 18:20

ggevay added A-ADAPTER Topics related to the ADAPTER layer A-optimization Area: query optimization and transformation labels Sep 7, 2025

ggevay requested review from aljoscha and bkirwi September 7, 2025 18:26

ggevay marked this pull request as ready for review September 7, 2025 18:27

ggevay requested a review from a team as a code owner September 7, 2025 18:27

ggevay changed the title ~~adapter: prep_relation_expr earlier in peek sequencing~~ adapter: prep_relation_expr earlier in peek sequencing Sep 7, 2025

ggevay changed the title ~~adapter: prep_relation_expr earlier in peek sequencing~~ prep_relation_expr earlier in peek sequencing Sep 7, 2025

ggevay force-pushed the prep-earlier branch from e7da299 to 397f828 Compare September 8, 2025 06:56

ggevay requested review from mgree and removed request for aljoscha September 8, 2025 08:49

ggevay mentioned this pull request Sep 8, 2025

parallel-workload: Use more interesting expressions in SELECT, views and conditions; support COPY FROM STDIN; prepared statements; run 10x faster #33520

Merged

5 tasks

mgree approved these changes Sep 10, 2025

adapter: prep_relation_expr earlier in peek/copy_to sequencing

7d10f2b

Fixes MaterializeInc/database-issues#9645

ggevay force-pushed the prep-earlier branch from 397f828 to 7d10f2b Compare September 10, 2025 16:05

ggevay enabled auto-merge September 10, 2025 16:05

ggevay merged commit aeeb685 into MaterializeInc:main Sep 10, 2025
128 checks passed

ggevay mentioned this pull request Mar 29, 2026

adapter: prep only once in peek/copy_to optimization #35777

Merged
