New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Fix min/max(x, n) #8311
Fix min/max(x, n) #8311
Conversation
✅ Deploy Preview for meta-velox canceled.
|
This pull request was exported from Phabricator. Differential Revision: D52638334 |
Summary: In Presto, accumulators of min/max(x, n) do not clear the heap when values are extracted from accumulator. But in Velox they do. Fix this bug to make Velox behavior align with Presto. This diff fixes facebookincubator#8138. Differential Revision: D52638334
Summary: In Presto, accumulators of min/max(x, n) do not clear the heap when values are extracted from accumulator. But in Velox they do. Fix this bug to make Velox behavior align with Presto. This diff fixes facebookincubator#8138. Differential Revision: D52638334
2114635
to
bf002e6
Compare
Summary: Velox has an optimization for Window operation with incremental aggregation when there are peer rows with the same frame. In this situation, the aggregation result is only computed at the first row of the peer and the rest rows simply copy this result. This optimization assumes that results of incremental aggregation in Window operation on peer rows should be the same. However, min/max(x, n) in Velox breaks this assumption because their extractValues() method causes the accumulator to be cleared, making the peer rows after the first row expect a different result. This diff fixes min/max(x, n) to make the extraction method not clear the accumulator. The behavior after the fix also align with Presto's. This diff also adds a method testIncrementalAggregation in testAggregations to check that extractValues() doesn't change accumulator afterwards for all aggregation functions. After this fix, only min_by/max_by(x, y, n) doesn't pass testIncrementalAggregation. This diff fixes facebookincubator#8138. Differential Revision: D52638334
This pull request was exported from Phabricator. Differential Revision: D52638334 |
.window({"max(c0, c1) over (partition by c2 order by c3 asc)"}) | ||
.planNode(); | ||
|
||
auto result = AssertQueryBuilder(plan).copyResults(pool()); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
copyResults -> assertResults(expected)
if (auto field = | ||
dynamic_cast<const core::FieldAccessTypedExpr*>(arg.get())) { | ||
auto channel = type.getChildIdx(field->name()); | ||
columns.push_back(input->childAt(channel)); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
input->childAt(field->name());
/// Returns child vector for the specified field name. Throws if field with
/// specified name doesn't exist.
VectorPtr& childAt(const std::string& name) {
return children_[type_->asRow().getChildIdx(name)];
}
const core::CallTypedExprPtr& aggregateExpr, | ||
const RowVectorPtr& input, | ||
memory::MemoryPool* pool) { | ||
auto type = input->type()->as<TypeKind::ROW>(); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
asRow()
std::vector<VectorPtr> columns; | ||
for (const auto& arg : aggregateExpr->inputs()) { | ||
if (auto field = | ||
dynamic_cast<const core::FieldAccessTypedExpr*>(arg.get())) { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
TypedExprs::asFieldAccess(arg)
auto channel = type.getChildIdx(field->name()); | ||
columns.push_back(input->childAt(channel)); | ||
} | ||
if (dynamic_cast<const core::ConstantTypedExpr*>(arg.get())) { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
isConstant(arg)
Summary: Velox has an optimization for Window operation with incremental aggregation when there are peer rows with the same frame. In this situation, the aggregation result is only computed at the first row of the peer and the rest rows simply copy this result. This optimization assumes that results of incremental aggregation in Window operation on peer rows should be the same. However, min/max(x, n) in Velox breaks this assumption because their extractValues() method causes the accumulator to be cleared, making the peer rows after the first row expect a different result. This diff fixes min/max(x, n) to make the extraction method not clear the accumulator. The behavior after the fix also align with Presto's. This diff also adds a method testIncrementalAggregation in testAggregations to check that extractValues() doesn't change accumulator afterwards for all aggregation functions. After this fix, only min_by/max_by(x, y, n) doesn't pass testIncrementalAggregation. This diff fixes facebookincubator#8138. Differential Revision: D52638334
bf002e6
to
ba4bd00
Compare
This pull request was exported from Phabricator. Differential Revision: D52638334 |
Summary: Velox has an optimization for Window operation with incremental aggregation when there are peer rows with the same frame. In this situation, the aggregation result is only computed at the first row of the peer and the rest rows simply copy this result. This optimization assumes that results of incremental aggregation in Window operation on peer rows should be the same. However, min/max(x, n) in Velox breaks this assumption because their extractValues() method causes the accumulator to be cleared, making the peer rows after the first row expect a different result. This diff fixes min/max(x, n) to make the extraction method not clear the accumulator. The behavior after the fix also align with Presto's. This diff also adds a method testIncrementalAggregation in testAggregations to check that extractValues() doesn't change accumulator afterwards for all aggregation functions. After this fix, only min_by/max_by(x, y, n) doesn't pass testIncrementalAggregation. This diff fixes facebookincubator#8138. Differential Revision: D52638334
ba4bd00
to
170a048
Compare
This pull request was exported from Phabricator. Differential Revision: D52638334 |
Summary: Velox has an optimization for Window operation with incremental aggregation when there are peer rows with the same frame. In this situation, the aggregation result is only computed at the first row of the peer and the rest rows simply copy this result. This optimization assumes that results of incremental aggregation in Window operation on peer rows should be the same. However, min/max(x, n) in Velox breaks this assumption because their extractValues() method causes the accumulator to be cleared, making the peer rows after the first row expect a different result. This diff fixes min/max(x, n) to make the extraction method not clear the accumulator. The behavior after the fix also align with Presto's. This diff also adds a method testIncrementalAggregation in testAggregations to check that extractValues() doesn't change accumulator afterwards for all aggregation functions. After this fix, only min_by/max_by(x, y, n) doesn't pass testIncrementalAggregation. This diff fixes facebookincubator#8138. Differential Revision: D52638334
170a048
to
6940296
Compare
This pull request was exported from Phabricator. Differential Revision: D52638334 |
6940296
to
42ac073
Compare
Summary: Velox has an optimization for Window operation with incremental aggregation when there are peer rows with the same frame. In this situation, the aggregation result is only computed at the first row of the peer and the rest rows simply copy this result. This optimization assumes that results of incremental aggregation in Window operation on peer rows should be the same. However, min/max(x, n) in Velox breaks this assumption because their extractValues() method causes the accumulator to be cleared, making the peer rows after the first row expect a different result. This diff fixes min/max(x, n) to make the extraction method not clear the accumulator. The behavior after the fix also align with Presto's. This diff also adds a method testIncrementalAggregation in testAggregations to check that extractValues() doesn't change accumulator afterwards for all aggregation functions. After this fix, only min_by/max_by(x, y, n) doesn't pass testIncrementalAggregation. This diff fixes facebookincubator#8138. Differential Revision: D52638334
Summary: Velox has an optimization for Window operation with incremental aggregation when there are peer rows with the same frame. In this situation, the aggregation result is only computed at the first row of the peer and the rest rows simply copy this result. This optimization assumes that results of incremental aggregation in Window operation on peer rows should be the same. However, min/max(x, n) in Velox breaks this assumption because their extractValues() method causes the accumulator to be cleared, making the peer rows after the first row expect a different result. This diff fixes min/max(x, n) to make the extraction method not clear the accumulator. The behavior after the fix also align with Presto's. This diff also adds a method testIncrementalAggregation in testAggregations to check that extractValues() doesn't change accumulator afterwards for all aggregation functions. After this fix, only min_by/max_by(x, y, n) doesn't pass testIncrementalAggregation. This diff fixes facebookincubator#8138. Differential Revision: D52638334
42ac073
to
191b1b1
Compare
This pull request was exported from Phabricator. Differential Revision: D52638334 |
1 similar comment
This pull request was exported from Phabricator. Differential Revision: D52638334 |
This pull request was exported from Phabricator. Differential Revision: D52638334 |
Summary: Velox has an optimization for Window operation with incremental aggregation when there are peer rows with the same frame. In this situation, the aggregation result is only computed at the first row of the peer and the rest rows simply copy this result. This optimization assumes that results of incremental aggregation in Window operation on peer rows should be the same. However, min/max(x, n) in Velox breaks this assumption because their extractValues() method causes the accumulator to be cleared, making the peer rows after the first row expect a different result. This diff fixes min/max(x, n) to make the extraction method not clear the accumulator. The behavior after the fix also align with Presto's. This diff also adds a method testIncrementalAggregation in testAggregations to check that extractValues() doesn't change accumulator afterwards for all aggregation functions. After this fix, only min_by/max_by(x, y, n) doesn't pass testIncrementalAggregation. This diff fixes facebookincubator#8138. Reviewed By: mbasmanova Differential Revision: D52638334
b76d946
to
c19795c
Compare
This pull request was exported from Phabricator. Differential Revision: D52638334 |
c19795c
to
1b9c66c
Compare
Summary: Velox has an optimization for Window operation with incremental aggregation when there are peer rows with the same frame. In this situation, the aggregation result is only computed at the first row of the peer and the rest rows simply copy this result. This optimization assumes that results of incremental aggregation in Window operation on peer rows should be the same. However, min/max(x, n) in Velox breaks this assumption because their extractValues() method causes the accumulator to be cleared, making the peer rows after the first row expect a different result. This diff fixes min/max(x, n) to make the extraction method not clear the accumulator. The behavior after the fix also align with Presto's. This diff also adds a method testIncrementalAggregation in testAggregations to check that extractValues() doesn't change accumulator afterwards for all aggregation functions. After this fix, only min_by/max_by(x, y, n) doesn't pass testIncrementalAggregation. This diff fixes facebookincubator#8138. Reviewed By: mbasmanova Differential Revision: D52638334
This pull request was exported from Phabricator. Differential Revision: D52638334 |
Summary: Velox has an optimization for Window operation with incremental aggregation when there are peer rows with the same frame. In this situation, the aggregation result is only computed at the first row of the peer and the rest rows simply copy this result. This optimization assumes that results of incremental aggregation in Window operation on peer rows should be the same. However, min/max(x, n) in Velox breaks this assumption because their extractValues() method causes the accumulator to be cleared, making the peer rows after the first row expect a different result. This diff fixes min/max(x, n) to make the extraction method not clear the accumulator. The behavior after the fix also align with Presto's. This diff also adds a method testIncrementalAggregation in testAggregations to check that extractValues() doesn't change accumulator afterwards for all aggregation functions. After this fix, only min_by/max_by(x, y, n) doesn't pass testIncrementalAggregation. This diff fixes facebookincubator#8138. Reviewed By: mbasmanova Differential Revision: D52638334
1b9c66c
to
a708dc4
Compare
This pull request was exported from Phabricator. Differential Revision: D52638334 |
Summary: Add a test in testAggregations to check that Aggregate::extractAccumulators and Aggregate::extractValues don't have side effects of changing the accumulators. min, max, min_by, and max_by currently have side effects. avg and sum for decimal inputs have undefined values in intermediate results. So they are excluded from the added testing. Reviewed By: mbasmanova Differential Revision: D52755566
Summary: Velox has an optimization for Window operation with incremental aggregation when there are peer rows with the same frame. In this situation, the aggregation result is only computed at the first row of the peer and the rest rows simply copy this result. This optimization assumes that results of incremental aggregation in Window operation on peer rows should be the same. However, min/max(x, n) in Velox breaks this assumption because their extractValues() method causes the accumulator to be cleared, making the peer rows after the first row expect a different result. This diff fixes min/max(x, n) to make the extraction method not clear the accumulator. The behavior after the fix also align with Presto's. This diff also adds a method testIncrementalAggregation in testAggregations to check that extractValues() doesn't change accumulator afterwards for all aggregation functions. After this fix, only min_by/max_by(x, y, n) doesn't pass testIncrementalAggregation. This diff fixes facebookincubator#8138. Reviewed By: mbasmanova Differential Revision: D52638334
Summary: Velox has an optimization for Window operation with incremental aggregation when there are peer rows with the same frame. In this situation, the aggregation result is only computed at the first row of the peer and the rest rows simply copy this result. This optimization assumes that results of incremental aggregation in Window operation on peer rows should be the same. However, min/max(x, n) in Velox breaks this assumption because their extractValues() method causes the accumulator to be cleared, making the peer rows after the first row expect a different result. This diff fixes min/max(x, n) to make the extraction method not clear the accumulator. The behavior after the fix also align with Presto's. This diff also adds a method testIncrementalAggregation in testAggregations to check that extractValues() doesn't change accumulator afterwards for all aggregation functions. After this fix, only min_by/max_by(x, y, n) doesn't pass testIncrementalAggregation. This diff fixes facebookincubator#8138. Reviewed By: mbasmanova Differential Revision: D52638334
a708dc4
to
48f2bf1
Compare
This pull request was exported from Phabricator. Differential Revision: D52638334 |
48f2bf1
to
3e560d6
Compare
This pull request was exported from Phabricator. Differential Revision: D52638334 |
This pull request has been merged in d747ac3. |
Conbench analyzed the 1 benchmark run on commit There were no benchmark performance regressions. 🎉 The full Conbench report has more details. |
Summary: Same as bug in min/max(x, n) fixed in facebookincubator#8311, min_by/max_by(x, y, n) also breaks the assumption of incremental window aggregation because their extractValues() methods has a side effect of clearing the accumulator. This diff fixes this issue by making the extractValues() methods of min_by/max_by(x, y, n) not clear the accumulators. Since Presto's min_by/max_by have the same bug (prestodb/presto#21653). This fix will make Velox's min_by/max_by behave differently from Presto when used in Window operation, until prestodb/presto#21653 is fixed. Differential Revision: D53139892
Summary: Same as bug in min/max(x, n) fixed in facebookincubator#8311, min_by/max_by(x, y, n) also breaks the assumption of incremental window aggregation because their extractValues() methods has a side effect of clearing the accumulator. This diff fixes this issue by making the extractValues() methods of min_by/max_by(x, y, n) not clear the accumulators. Since Presto's min_by/max_by have the same bug (prestodb/presto#21653). This fix will make Velox's min_by/max_by behave differently from Presto when used in Window operation, until prestodb/presto#21653 is fixed. This diff fixes facebookincubator#8138. Differential Revision: D53139892
Summary: Same as bug in min/max(x, n) fixed in facebookincubator#8311, min_by/max_by(x, y, n) also breaks the assumption of incremental window aggregation because their extractValues() methods has a side effect of clearing the accumulator. This diff fixes this issue by making the extractValues() methods of min_by/max_by(x, y, n) not clear the accumulators. Since Presto's min_by/max_by have the same bug (prestodb/presto#21653). This fix will make Velox's min_by/max_by behave differently from Presto when used in Window operation, until prestodb/presto#21653 is fixed. This diff fixes facebookincubator#8138. Differential Revision: D53139892
Summary: Pull Request resolved: #8566 Same as bug in min/max(x, n) fixed in #8311, min_by/max_by(x, y, n) also breaks the assumption of incremental window aggregation because their extractValues() methods has a side effect of clearing the accumulator. This diff fixes this issue by making the extractValues() methods of min_by/max_by(x, y, n) not clear the accumulators. Since Presto's min_by/max_by have the same bug (prestodb/presto#21653). This fix will make Velox's min_by/max_by behave differently from Presto when used in Window operation, until prestodb/presto#21653 is fixed. This diff fixes #8138. Reviewed By: bikramSingh91 Differential Revision: D53139892 fbshipit-source-id: 1323f22196e22554c0d880d20584a4ee4059b64c
Summary: Pull Request resolved: facebookincubator#8566 Same as bug in min/max(x, n) fixed in facebookincubator#8311, min_by/max_by(x, y, n) also breaks the assumption of incremental window aggregation because their extractValues() methods has a side effect of clearing the accumulator. This diff fixes this issue by making the extractValues() methods of min_by/max_by(x, y, n) not clear the accumulators. Since Presto's min_by/max_by have the same bug (prestodb/presto#21653). This fix will make Velox's min_by/max_by behave differently from Presto when used in Window operation, until prestodb/presto#21653 is fixed. This diff fixes facebookincubator#8138. Reviewed By: bikramSingh91 Differential Revision: D53139892 fbshipit-source-id: 1323f22196e22554c0d880d20584a4ee4059b64c
Summary: Pull Request resolved: facebookincubator#8566 Same as bug in min/max(x, n) fixed in facebookincubator#8311, min_by/max_by(x, y, n) also breaks the assumption of incremental window aggregation because their extractValues() methods has a side effect of clearing the accumulator. This diff fixes this issue by making the extractValues() methods of min_by/max_by(x, y, n) not clear the accumulators. Since Presto's min_by/max_by have the same bug (prestodb/presto#21653). This fix will make Velox's min_by/max_by behave differently from Presto when used in Window operation, until prestodb/presto#21653 is fixed. This diff fixes facebookincubator#8138. Reviewed By: bikramSingh91 Differential Revision: D53139892 fbshipit-source-id: 1323f22196e22554c0d880d20584a4ee4059b64c
Summary: Pull Request resolved: facebookincubator#8566 Same as bug in min/max(x, n) fixed in facebookincubator#8311, min_by/max_by(x, y, n) also breaks the assumption of incremental window aggregation because their extractValues() methods has a side effect of clearing the accumulator. This diff fixes this issue by making the extractValues() methods of min_by/max_by(x, y, n) not clear the accumulators. Since Presto's min_by/max_by have the same bug (prestodb/presto#21653). This fix will make Velox's min_by/max_by behave differently from Presto when used in Window operation, until prestodb/presto#21653 is fixed. This diff fixes facebookincubator#8138. Reviewed By: bikramSingh91 Differential Revision: D53139892 fbshipit-source-id: 1323f22196e22554c0d880d20584a4ee4059b64c
Summary:
Velox has an optimization for Window operation with incremental aggregation when there are peer
rows with the same frame. In this situation, the aggregation result is only computed at the first row
of the peer and the rest rows simply copy this result. This optimization assumes that results of
incremental aggregation in Window operation on peer rows should be the same. However, min/max(x, n)
in Velox breaks this assumption because their extractValues() method causes the accumulator to
be cleared, making the peer rows after the first row expect a different result. This diff fixes min/max(x, n)
to make the extraction method not clear the accumulator. The behavior after the fix also align with
Presto's.
This diff fixes #8138.
Differential Revision: D52638334