Skip to content

Collapse generate_series when output columns are projected away#22753

Closed
frankmcsherry wants to merge 3 commits into
mainfrom
collapse_series
Closed

Collapse generate_series when output columns are projected away#22753
frankmcsherry wants to merge 3 commits into
mainfrom
collapse_series

Conversation

@frankmcsherry
Copy link
Copy Markdown
Contributor

This PR updates projection pushdown for FlatMap nodes to notice example cases where output columns are unused, and the number of records can be determined from the data. This is essentially only in support of optimizing generate_series, which has historically forced complete unpacking for something like

select count(*) from generate_series(1, 1000000);

This now optimizes to the constant 1000000 rather than a dataflow that deploys to build one million rows and count them.

Motivation

Tips for reviewer

Checklist

  • This PR has adequate test coverage / QA involvement has been duly considered.
  • This PR has an associated up-to-date design doc, is a design doc (template), or is sufficiently small to not require a design.
  • If this PR evolves an existing $T ⇔ Proto$T mapping (possibly in a backwards-incompatible way), then it is tagged with a T-proto label.
  • If this PR will require changes to cloud orchestration or tests, there is a companion cloud PR to account for those changes that is tagged with the release-blocker label (example).
  • This PR includes the following user-facing behavior changes:

@frankmcsherry frankmcsherry requested a review from a team October 27, 2023 19:18
Copy link
Copy Markdown
Member

@antiguru antiguru left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Looks fine, although we probably want a different rewrite mechanism for the types that we could support...

Well, and some of the now-failing tests need to be thought through, because they rely on undocumented behavior of generate_series.

Comment on lines +454 to +457
if limit_remaining < 1 as usize {
return None;
}
limit_remaining -= *diff as usize;
limit_remaining -= 1 as usize;
Copy link
Copy Markdown
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Looks like this doesn't need a cast.

Copy link
Copy Markdown
Contributor

@def- def- left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The example select count(*) from generate_series(1, 1000000); panics for me (Debug build):

thread 'tokio:work-69' panicked at src/expr/src/relation/func.rs:2480:15:
Datum::unwrap_int64 called on Int32(1000000)
stack backtrace:
   0: rust_begin_unwind
             at /rustc/187b8131d4f760f856b214fce34534903276f2ef/library/std/src/panicking.rs:597:5
   1: core::panicking::panic_fmt
             at /rustc/187b8131d4f760f856b214fce34534903276f2ef/library/core/src/panicking.rs:72:14
   2: mz_repr::scalar::Datum::unwrap_int64
             at ./src/repr/src/scalar.rs:582:18
   3: mz_expr::relation::func::repeat
             at ./src/expr/src/relation/func.rs:2480:13
   4: mz_expr::relation::func::TableFunc::eval
             at ./src/expr/src/relation/func.rs:2727:46
   5: mz_transform::fold_constants::FoldConstants::fold_flat_map_constant
             at ./src/transform/src/fold_constants.rs:619:35
   6: mz_transform::fold_constants::FoldConstants::action
             at ./src/transform/src/fold_constants.rs:218:37
   7: <mz_transform::fold_constants::FoldConstants as mz_transform::Transform>::transform::{{closure}}
             at ./src/transform/src/fold_constants.rs:51:13
   8: mz_expr::visit::StackSafeVisit<T>::try_visit_mut_post::{{closure}}
             at ./src/expr/src/visit.rs:556:13
   9: mz_ore::stack::CheckedRecursion::checked_recur::{{closure}}
             at ./src/ore/src/stack.rs:191:33
  10: stacker::maybe_grow
             at /home/deen/.cargo/registry/src/index.crates.io-6f17d22bba15001f/stacker-0.1.15/src/lib.rs:55:9
  11: mz_ore::stack::maybe_grow
             at ./src/ore/src/stack.rs:100:5
  12: mz_ore::stack::CheckedRecursion::checked_recur
             at ./src/ore/src/stack.rs:191:19
  13: mz_expr::visit::StackSafeVisit<T>::try_visit_mut_post
             at ./src/expr/src/visit.rs:554:9
  14: mz_expr::visit::StackSafeVisit<T>::try_visit_mut_post::{{closure}}::{{closure}}
             at ./src/expr/src/visit.rs:555:50
  15: <mz_expr::relation::MirRelationExpr as mz_expr::visit::VisitChildren<mz_expr::relation::MirRelationExpr>>::try_visit_mut_children
             at ./src/expr/src/relation/mod.rs:2233:13
  16: mz_expr::visit::StackSafeVisit<T>::try_visit_mut_post::{{closure}}
             at ./src/expr/src/visit.rs:555:13
  17: mz_ore::stack::CheckedRecursion::checked_recur::{{closure}}
             at ./src/ore/src/stack.rs:191:33
  18: stacker::maybe_grow
             at /home/deen/.cargo/registry/src/index.crates.io-6f17d22bba15001f/stacker-0.1.15/src/lib.rs:55:9
  19: mz_ore::stack::maybe_grow
             at ./src/ore/src/stack.rs:100:5
  20: mz_ore::stack::CheckedRecursion::checked_recur
             at ./src/ore/src/stack.rs:191:19
  21: mz_expr::visit::StackSafeVisit<T>::try_visit_mut_post
             at ./src/expr/src/visit.rs:554:9
  22: mz_expr::visit::StackSafeVisit<T>::try_visit_mut_post::{{closure}}::{{closure}}
             at ./src/expr/src/visit.rs:555:50
  23: <mz_expr::relation::MirRelationExpr as mz_expr::visit::VisitChildren<mz_expr::relation::MirRelationExpr>>::try_visit_mut_children
             at ./src/expr/src/relation/mod.rs:2233:13
  24: mz_expr::visit::StackSafeVisit<T>::try_visit_mut_post::{{closure}}
             at ./src/expr/src/visit.rs:555:13
  25: mz_ore::stack::CheckedRecursion::checked_recur::{{closure}}
             at ./src/ore/src/stack.rs:191:33
  26: stacker::maybe_grow
             at /home/deen/.cargo/registry/src/index.crates.io-6f17d22bba15001f/stacker-0.1.15/src/lib.rs:55:9
  27: mz_ore::stack::maybe_grow
             at ./src/ore/src/stack.rs:100:5
  28: mz_ore::stack::CheckedRecursion::checked_recur
             at ./src/ore/src/stack.rs:191:19
  29: mz_expr::visit::StackSafeVisit<T>::try_visit_mut_post
             at ./src/expr/src/visit.rs:554:9
  30: <T as mz_expr::visit::Visit>::try_visit_mut_post
             at ./src/expr/src/visit.rs:345:9
  31: <mz_transform::fold_constants::FoldConstants as mz_transform::Transform>::transform
             at ./src/transform/src/fold_constants.rs:47:22
  32: <mz_transform::Fixpoint as mz_transform::Transform>::transform::{{closure}}
             at ./src/transform/src/lib.rs:331:25
  33: tracing::span::Span::in_scope
             at /home/deen/.cargo/registry/src/index.crates.io-6f17d22bba15001f/tracing-0.1.37/src/span.rs:1102:9
  34: <mz_transform::Fixpoint as mz_transform::Transform>::transform
             at ./src/transform/src/lib.rs:329:17
  35: mz_transform::Optimizer::transform
             at ./src/transform/src/lib.rs:715:13
  36: mz_transform::dataflow::optimize_dataflow_relations
             at ./src/transform/src/dataflow.rs:229:9
  37: mz_transform::dataflow::optimize_dataflow
             at ./src/transform/src/dataflow.rs:77:5
  38: mz_adapter::coord::sequencer::inner::<impl mz_adapter::coord::Coordinator>::optimize_peek
             at ./src/adapter/src/coord/sequencer/inner.rs:2317:33
  39: mz_adapter::coord::sequencer::inner::<impl mz_adapter::coord::Coordinator>::peek_stage_optimize::{{closure}}::{{closure}}
             at ./src/adapter/src/coord/sequencer/inner.rs:2240:27
  40: <alloc::boxed::Box<F,A> as core::ops::function::FnOnce<Args>>::call_once
             at /rustc/187b8131d4f760f856b214fce34534903276f2ef/library/alloc/src/boxed.rs:2007:9
  41: <tokio::runtime::blocking::task::BlockingTask<T> as core::future::future::Future>::poll
             at /home/deen/.cargo/registry/src/index.crates.io-6f17d22bba15001f/tokio-1.32.0/src/runtime/blocking/task.rs:42:21
  42: tokio::runtime::task::core::Core<T,S>::poll::{{closure}}
             at /home/deen/.cargo/registry/src/index.crates.io-6f17d22bba15001f/tokio-1.32.0/src/runtime/task/core.rs:334:17
  43: tokio::loom::std::unsafe_cell::UnsafeCell<T>::with_mut
             at /home/deen/.cargo/registry/src/index.crates.io-6f17d22bba15001f/tokio-1.32.0/src/loom/std/unsafe_cell.rs:16:9
  44: tokio::runtime::task::core::Core<T,S>::poll
             at /home/deen/.cargo/registry/src/index.crates.io-6f17d22bba15001f/tokio-1.32.0/src/runtime/task/core.rs:323:13
  45: tokio::runtime::task::harness::poll_future::{{closure}}
             at /home/deen/.cargo/registry/src/index.crates.io-6f17d22bba15001f/tokio-1.32.0/src/runtime/task/harness.rs:485:19
  46: <core::panic::unwind_safe::AssertUnwindSafe<F> as core::ops::function::FnOnce<()>>::call_once
             at /rustc/187b8131d4f760f856b214fce34534903276f2ef/library/core/src/panic/unwind_safe.rs:271:9
  47: std::panicking::try::do_call
             at /rustc/187b8131d4f760f856b214fce34534903276f2ef/library/std/src/panicking.rs:504:40
  48: __rust_try
  49: std::panicking::try
             at /rustc/187b8131d4f760f856b214fce34534903276f2ef/library/std/src/panicking.rs:468:19
  50: std::panic::catch_unwind
             at /rustc/187b8131d4f760f856b214fce34534903276f2ef/library/std/src/panic.rs:142:14
  51: tokio::runtime::task::harness::poll_future
             at /home/deen/.cargo/registry/src/index.crates.io-6f17d22bba15001f/tokio-1.32.0/src/runtime/task/harness.rs:473:18
  52: tokio::runtime::task::harness::Harness<T,S>::poll_inner
             at /home/deen/.cargo/registry/src/index.crates.io-6f17d22bba15001f/tokio-1.32.0/src/runtime/task/harness.rs:208:27
  53: tokio::runtime::task::harness::Harness<T,S>::poll
             at /home/deen/.cargo/registry/src/index.crates.io-6f17d22bba15001f/tokio-1.32.0/src/runtime/task/harness.rs:153:15
  54: tokio::runtime::task::raw::poll
             at /home/deen/.cargo/registry/src/index.crates.io-6f17d22bba15001f/tokio-1.32.0/src/runtime/task/raw.rs:276:5
  55: tokio::runtime::task::raw::RawTask::poll
             at /home/deen/.cargo/registry/src/index.crates.io-6f17d22bba15001f/tokio-1.32.0/src/runtime/task/raw.rs:200:18
  56: tokio::runtime::task::UnownedTask<S>::run
             at /home/deen/.cargo/registry/src/index.crates.io-6f17d22bba15001f/tokio-1.32.0/src/runtime/task/mod.rs:437:9
  57: tokio::runtime::blocking::pool::Task::run
             at /home/deen/.cargo/registry/src/index.crates.io-6f17d22bba15001f/tokio-1.32.0/src/runtime/blocking/pool.rs:159:9
  58: tokio::runtime::blocking::pool::Inner::run
             at /home/deen/.cargo/registry/src/index.crates.io-6f17d22bba15001f/tokio-1.32.0/src/runtime/blocking/pool.rs:513:17
  59: tokio::runtime::blocking::pool::Spawner::spawn_thread::{{closure}}
             at /home/deen/.cargo/registry/src/index.crates.io-6f17d22bba15001f/tokio-1.32.0/src/runtime/blocking/pool.rs:471:13

@def-
Copy link
Copy Markdown
Contributor

def- commented Oct 27, 2023

Explaining queries containing generate_series go OoM now, but worked before:

explain select count(*) from generate_series(0, 999) AS x, generate_series(0, 999) AS y, generate_series(0, 999) AS z;

(Similar issue for csv_extract: https://github.com/MaterializeInc/database-issues/issues/6871)

@frankmcsherry
Copy link
Copy Markdown
Contributor Author

The first issue is fixed, sorry! The second issue goes a bit deeper, and while I can prevent the OOM, I cannot yet prevent about 45s of "counting up to one billion" on some core. At the heart of it, the AggregateFunc::eval methods require iterators over Datum, which means we need to provide one billion records. That can be done memory-efficiently with an iterator, but seemingly not compute-efficiently, that I have found. I filed https://github.com/MaterializeInc/database-issues/issues/6932 as a follow-up issue, but perhaps the right thing to do is get the changes in that turn this into a CPU issue rather than memory issue, and not land the PR.

@frankmcsherry
Copy link
Copy Markdown
Contributor Author

One plausible thing to do is revert the relaxation to fold_reduce_constant that admits constant relations that are small in binary but large in unary, which results in select count(*) from generate_series(1, 1000000) turning in to a non-constant plan that must be shipped to a compute cluster, but at least only takes a mere moment while it uses its binary count implementation.

Independently, the changes to the rest of fold_reduce_constant merit some thought, as it replaces some unpleasant behavior with what I think is a more robust implementation.

@frankmcsherry frankmcsherry deleted the collapse_series branch February 5, 2025 11:53
This was referenced May 28, 2026
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

3 participants