-
Notifications
You must be signed in to change notification settings - Fork 28.1k
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
[SPARK-13092][SQL] Add ExpressionSet for constraint tracking #11338
Conversation
/cc @sameeragarwal @yhuai |
test this please |
Test build #51829 has finished for PR 11338 at commit
|
assert((initialSet + (aUpper + 1)).size == 1) | ||
assert((initialSet + (aUpper + 2)).size == 2) | ||
assert((initialSet - (aUpper + 1)).size == 0) | ||
assert((initialSet - (aUpper + 2)).size == 1) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Also test (initialSet + (aLower + 1)).size
and (initialSet - (aLower + 1)).size
?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Added
verifyCallCount( | ||
df.selectExpr("testUdf(a + 1) + testUdf(1 + a)", "testUdf(a + 1)"), Row(4, 2), 2) | ||
df.selectExpr("testUdf(a + 1) + testUdf(1 + a)", "testUdf(a + 1)"), Row(4, 2), 1) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
/cc @nongli
test this please |
2 similar comments
test this please |
test this please |
LGTM |
test this please |
Test build #51909 has finished for PR 11338 at commit
|
Test build #2581 has finished for PR 11338 at commit
|
test this please |
Test build #51917 has finished for PR 11338 at commit
|
Thanks. Merging to master. |
## What changes were proposed in this pull request? This PR is a small follow up on #11338 (https://issues.apache.org/jira/browse/SPARK-13092) to use `ExpressionSet` as part of the verification logic in `ConstraintPropagationSuite`. ## How was this patch tested? No new tests added. Just changes the verification logic in `ConstraintPropagationSuite`. Author: Sameer Agarwal <sameer@databricks.com> Closes #11611 from sameeragarwal/expression-set.
## What changes were proposed in this pull request? This PR is a small follow up on apache#11338 (https://issues.apache.org/jira/browse/SPARK-13092) to use `ExpressionSet` as part of the verification logic in `ConstraintPropagationSuite`. ## How was this patch tested? No new tests added. Just changes the verification logic in `ConstraintPropagationSuite`. Author: Sameer Agarwal <sameer@databricks.com> Closes apache#11611 from sameeragarwal/expression-set.
This PR adds a new abstraction called an
ExpressionSet
which attempts to canonicalize expressions to remove cosmetic differences. Deterministic expressions that are in the set after canonicalization will always return the same answer given the same input (i.e. false positives should not be possible). However, it is possible that two canonical expressions that are not equal will in fact return the same answer given any input (i.e. false negatives are possible).Other relevant changes include:
semanticEquals
andsemanticHash
, those functions are also ported to this new infrastructure.canonicalized
version of the expression is added as alazy val
toExpression
and is used by bothsemanticEquals
andExpressionSet
.ExpressionSet
are addedsemanticEquals
to be less intelligent than it now is are updated.As a followup, we should consider auditing the places where we do
O(n)
semanticEquals
operations and replace them withExpressionSet
. We should also consider consolidatingAttributeSet
as a specialized factory for anExpressionSet.