Improve must_consolidate setting in single-time dataflow refinement #18732

Closed
Tracked by #14442
vmarcos opened this issue Apr 12, 2023 · 0 comments · Fixed by #19680
Labels
A-compute Area: compute A-dataflow Area: dataflow


vmarcos commented Apr 12, 2023

As observed in #18546 (comment), we can set the must_consolidate field of monotonic operators to false in single-time plans when the operator's input is already consolidated upstream. For example, if the input to a monotonic top-k operator originates from a reduce operator, consolidation in the monotonic operator is unnecessary: the reduce operator arranges its output, which implicitly consolidates it, and a consolidated collection is a monotonic stream in a single-time context. Similarly, inputs originating from indexes or ArrangeBy nodes are monotonic in a single-time context, and remain so as long as no intermediate operator breaks monotonicity.

This issue is a request to revisit the single-time dataflow refinement in finalize_dataflow introduced by #18546 and to add reasoning that selectively sets must_consolidate to false whenever possible, as explained above.
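
To illustrate the reasoning, here is a minimal sketch of what such a refinement pass could look like. It is not the actual implementation: the `Plan` enum, its variants, and the `refine` function are hypothetical simplifications. The sketch assumes that `Map`-like operators preserve monotonicity, that `Negate` breaks it, and that nothing is known about the output of the monotonic top-k itself.

```rust
/// Hypothetical, heavily simplified plan type for a single-time dataflow.
/// The variant names are illustrative and do not mirror the real plan types.
#[derive(Debug)]
enum Plan {
    /// Raw input that may contain retractions that have not been cancelled.
    RawSource,
    /// Read from an existing index (arranged, hence consolidated).
    IndexGet,
    /// Arranges its input, implicitly consolidating it.
    ArrangeBy(Box<Plan>),
    /// Arranges its output, implicitly consolidating it.
    Reduce(Box<Plan>),
    /// Row-wise transformation; assumed here to preserve monotonicity.
    Map(Box<Plan>),
    /// Flips the sign of every diff, so it breaks monotonicity.
    Negate(Box<Plan>),
    /// Monotonic only if every input is monotonic.
    Union(Vec<Plan>),
    /// Monotonic top-k; `must_consolidate` is what the pass refines.
    MonotonicTopK {
        input: Box<Plan>,
        must_consolidate: bool,
    },
}

/// Walks the plan bottom-up and returns whether the output of `plan` is known
/// to be monotonic (free of retractions) in a single-time context. Along the
/// way, it clears `must_consolidate` on monotonic operators whose input is
/// already known to be monotonic.
fn refine(plan: &mut Plan) -> bool {
    match plan {
        Plan::RawSource => false,
        Plan::IndexGet => true,
        Plan::ArrangeBy(input) | Plan::Reduce(input) => {
            // Still visit the subtree so nested operators get refined.
            refine(input);
            true
        }
        Plan::Map(input) => refine(input),
        Plan::Negate(input) => {
            refine(input);
            false
        }
        Plan::Union(inputs) => {
            let mut all_monotonic = true;
            for input in inputs.iter_mut() {
                // No short-circuiting: every branch must be visited.
                all_monotonic &= refine(input);
            }
            all_monotonic
        }
        Plan::MonotonicTopK {
            input,
            must_consolidate,
        } => {
            let input_monotonic = refine(input);
            *must_consolidate = !input_monotonic;
            // Conservative assumption: claim nothing about the output.
            false
        }
    }
}
```

With this sketch, a top-k over `Map(Reduce(RawSource))` ends up with `must_consolidate` set to false, while a top-k over `Negate(IndexGet)` keeps it set to true. Returning `false` when in doubt is the safe default, since a spurious consolidation only costs performance.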
