Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

[SPARK-18711][SQL] should disable subexpression elimination for LambdaVariable #16143

Closed
wants to merge 1 commit into from

Conversation

cloud-fan
Copy link
Contributor

What changes were proposed in this pull request?

This is kind of a long-standing bug, it's hidden until #15780 , which may add AssertNotNull on top of LambdaVariable and thus enables subexpression elimination.

However, subexpression elimination will evaluate the common expressions at the beginning, which is invalid for LambdaVariable. LambdaVariable usually represents loop variable, which can't be evaluated ahead of the loop.

This PR skips expressions containing LambdaVariable when doing subexpression elimination.

How was this patch tested?

updated test in DatasetAggregatorSuite

@cloud-fan
Copy link
Contributor Author

cc @rxin @yhuai @hvanhovell

@SparkQA
Copy link

SparkQA commented Dec 5, 2016

Test build #69665 has finished for PR 16143 at commit 3c1f20d.

  • This patch passes all tests.
  • This patch merges cleanly.
  • This patch adds no public classes.

@hvanhovell
Copy link
Contributor

LGTM - merging to master/2.1. Thanks!

asfgit pushed a commit that referenced this pull request Dec 5, 2016
…aVariable

## What changes were proposed in this pull request?

This is kind of a long-standing bug, it's hidden until #15780 , which may add `AssertNotNull` on top of `LambdaVariable` and thus enables subexpression elimination.

However, subexpression elimination will evaluate the common expressions at the beginning, which is invalid for `LambdaVariable`. `LambdaVariable` usually represents loop variable, which can't be evaluated ahead of the loop.

This PR skips expressions containing `LambdaVariable` when doing subexpression elimination.

## How was this patch tested?

updated test in `DatasetAggregatorSuite`

Author: Wenchen Fan <wenchen@databricks.com>

Closes #16143 from cloud-fan/aggregator.

(cherry picked from commit 01a7d33)
Signed-off-by: Herman van Hovell <hvanhovell@databricks.com>
@asfgit asfgit closed this in 01a7d33 Dec 5, 2016
@koertkuipers
Copy link
Contributor

thanks for getting this fixed so fast

robert3005 pushed a commit to palantir/spark that referenced this pull request Dec 15, 2016
…aVariable

## What changes were proposed in this pull request?

This is kind of a long-standing bug, it's hidden until apache#15780 , which may add `AssertNotNull` on top of `LambdaVariable` and thus enables subexpression elimination.

However, subexpression elimination will evaluate the common expressions at the beginning, which is invalid for `LambdaVariable`. `LambdaVariable` usually represents loop variable, which can't be evaluated ahead of the loop.

This PR skips expressions containing `LambdaVariable` when doing subexpression elimination.

## How was this patch tested?

updated test in `DatasetAggregatorSuite`

Author: Wenchen Fan <wenchen@databricks.com>

Closes apache#16143 from cloud-fan/aggregator.
uzadude pushed a commit to uzadude/spark that referenced this pull request Jan 27, 2017
…aVariable

## What changes were proposed in this pull request?

This is kind of a long-standing bug, it's hidden until apache#15780 , which may add `AssertNotNull` on top of `LambdaVariable` and thus enables subexpression elimination.

However, subexpression elimination will evaluate the common expressions at the beginning, which is invalid for `LambdaVariable`. `LambdaVariable` usually represents loop variable, which can't be evaluated ahead of the loop.

This PR skips expressions containing `LambdaVariable` when doing subexpression elimination.

## How was this patch tested?

updated test in `DatasetAggregatorSuite`

Author: Wenchen Fan <wenchen@databricks.com>

Closes apache#16143 from cloud-fan/aggregator.
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
4 participants