Incorrect result of min_by/max_by(x, y, n) in Window operation #21653
cc @mbasmanova
cc @arhimondr
kagamiori changed the title from "Inconsistent behavior between min/max(x, n) and min_by/max_by(x, y, n) in Window operation" to "Incorrect result of min_by/max_by(x, y, n) in Window operation" on Jan 12, 2024
kagamiori added a commit to kagamiori/velox that referenced this issue on Jan 26, 2024
Summary: Same as the bug in min/max(x, n) fixed in facebookincubator#8311, min_by/max_by(x, y, n) also break the assumption of incremental window aggregation because their extractValues() methods have the side effect of clearing the accumulator. This diff fixes the issue by making the extractValues() methods of min_by/max_by(x, y, n) not clear the accumulators. Since Presto's min_by/max_by have the same bug (prestodb/presto#21653), this fix will make Velox's min_by/max_by behave differently from Presto when used in a Window operation until prestodb/presto#21653 is fixed. Differential Revision: D53139892
kagamiori added a commit to kagamiori/velox that referenced this issue on Jan 30, 2024
Summary: Same as the bug in min/max(x, n) fixed in facebookincubator#8311, min_by/max_by(x, y, n) also break the assumption of incremental window aggregation because their extractValues() methods have the side effect of clearing the accumulator. This diff fixes the issue by making the extractValues() methods of min_by/max_by(x, y, n) not clear the accumulators. Since Presto's min_by/max_by have the same bug (prestodb/presto#21653), this fix will make Velox's min_by/max_by behave differently from Presto when used in a Window operation until prestodb/presto#21653 is fixed. This diff fixes facebookincubator#8138. Differential Revision: D53139892
facebook-github-bot pushed a commit to facebookincubator/velox that referenced this issue on Feb 12, 2024
Summary: Pull Request resolved: #8566 Same as the bug in min/max(x, n) fixed in #8311, min_by/max_by(x, y, n) also break the assumption of incremental window aggregation because their extractValues() methods have the side effect of clearing the accumulator. This diff fixes the issue by making the extractValues() methods of min_by/max_by(x, y, n) not clear the accumulators. Since Presto's min_by/max_by have the same bug (prestodb/presto#21653), this fix will make Velox's min_by/max_by behave differently from Presto when used in a Window operation until prestodb/presto#21653 is fixed. This diff fixes #8138. Reviewed By: bikramSingh91 Differential Revision: D53139892 fbshipit-source-id: 1323f22196e22554c0d880d20584a4ee4059b64c
FelixYBW pushed a commit to FelixYBW/velox that referenced this issue on Feb 12, 2024
Summary: Pull Request resolved: facebookincubator#8566 Same as the bug in min/max(x, n) fixed in facebookincubator#8311, min_by/max_by(x, y, n) also break the assumption of incremental window aggregation because their extractValues() methods have the side effect of clearing the accumulator. This diff fixes the issue by making the extractValues() methods of min_by/max_by(x, y, n) not clear the accumulators. Since Presto's min_by/max_by have the same bug (prestodb/presto#21653), this fix will make Velox's min_by/max_by behave differently from Presto when used in a Window operation until prestodb/presto#21653 is fixed. This diff fixes facebookincubator#8138. Reviewed By: bikramSingh91 Differential Revision: D53139892 fbshipit-source-id: 1323f22196e22554c0d880d20584a4ee4059b64c
Hi community, I noticed that min_by/max_by(x, y, n) produces incorrect results when used in a Window operation. As an example, consider the query below with min_by(x, y, n). Since the window frame is UNBOUNDED PRECEDING to CURRENT ROW, when the second row is processed, the function should aggregate both the first and second input rows, so the result array should contain two values, i.e., [1, 2]. However, the current result for the second row is only [2].
This error is caused by AbstractMinMaxByNAggregationFunction::output() breaking the assumption for "expanding frame" in AggregateWindowFunction::processRow(). AggregateWindowFunction::processRow() has an optimization branch for the case where the new frame is the same as, or an expansion of, the previously computed frame; in that branch it only adds the additional input rows of the new frame to the accumulator.
presto/presto-main/src/main/java/com/facebook/presto/operator/window/AggregateWindowFunction.java
Lines 67 to 71 in d0b658e
This relies on the implicit assumption that the contents of the accumulator from the previous frame remain in the accumulator when the current frame is processed. This, however, does not hold for min_by, because AbstractMinMaxByNAggregationFunction::output() clears the accumulator.
presto/presto-main/src/main/java/com/facebook/presto/operator/aggregation/minmaxby/AbstractMinMaxByNAggregationFunction.java
Line 147 in cb582bc
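The interaction described above can be reproduced in a small standalone model. The sketch below is not Presto's code: the class ExpandingFrameDemo, the accumulator SmallestN, and the flag restoreAfterOutput are hypothetical names introduced only for illustration, and the accumulator is simplified to a single column (as if x and y were the same value). It feeds rows one at a time, the way the expanding-frame branch adds only the new rows, and calls output() after each row. With a draining output() the second row loses the first value; putting the drained values back into the heap keeps the earlier rows.

```java
import java.util.ArrayList;
import java.util.Collections;
import java.util.List;
import java.util.PriorityQueue;

public class ExpandingFrameDemo {
    /** Toy stand-in for a "smallest n values" accumulator; hypothetical, not Presto's class. */
    static final class SmallestN {
        private final int n;
        private final boolean restoreAfterOutput;
        // Max-heap: when more than n values are held, the largest one is evicted.
        private final PriorityQueue<Integer> heap =
                new PriorityQueue<>(Collections.reverseOrder());

        SmallestN(int n, boolean restoreAfterOutput) {
            this.n = n;
            this.restoreAfterOutput = restoreAfterOutput;
        }

        void add(int value) {
            heap.add(value);
            if (heap.size() > n) {
                heap.poll();
            }
        }

        /** Produces the retained values in ascending order by draining the heap. */
        List<Integer> output() {
            List<Integer> drained = new ArrayList<>();
            while (!heap.isEmpty()) {
                drained.add(heap.poll()); // polled largest-first
            }
            if (restoreAfterOutput) {
                heap.addAll(drained); // put the values back so later frames still see them
            }
            Collections.reverse(drained); // report smallest-first
            return drained;
        }
    }

    /** Models an UNBOUNDED PRECEDING to CURRENT ROW frame processed incrementally. */
    static List<List<Integer>> runExpandingFrame(int[] rows, SmallestN accumulator) {
        List<List<Integer>> results = new ArrayList<>();
        for (int row : rows) {
            accumulator.add(row); // only the row newly added to the frame
            results.add(accumulator.output());
        }
        return results;
    }

    public static void main(String[] args) {
        int[] rows = {1, 2};
        // Draining output(): the second row no longer sees the first value.
        System.out.println(runExpandingFrame(rows, new SmallestN(2, false))); // [[1], [2]]
        // Restoring output(): earlier rows stay in the accumulator.
        System.out.println(runExpandingFrame(rows, new SmallestN(2, true))); // [[1], [1, 2]]
    }
}
```

The first printed line matches the incorrect result reported for the second row ([2]), and the second line matches the expected one ([1, 2]).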
Your Environment
Expected Behavior
Expected result is
Current Behavior
Current result is
Possible Solution
Add heap.addAll(reversedBlockBuilder); to the end of AbstractMinMaxByNAggregationFunction::output(), before out.closeEntry().
Steps to Reproduce
Screenshots (if appropriate)
Context