[SPARK-34222][SQL][FOLLOWUP] Non-recursive implementation of buildBalancedPredicate #31724

Closed
wants to merge 1 commit into from

gengliangwang
Member

@gengliangwang gengliangwang commented Mar 3, 2021

What changes were proposed in this pull request?

Use a non-recursive implementation for the function buildBalancedPredicate
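As a rough illustration of what a non-recursive balanced combine can look like, here is a minimal sketch that pairs adjacent items level by level until one remains. It is not the code from this PR: `Expr`, `Leaf`, `Node`, `combineBalanced`, and `depth` are hypothetical stand-ins for Catalyst's expression classes and helper.

```scala
// Hypothetical stand-ins for Catalyst expressions; not Spark's real classes.
sealed trait Expr
final case class Leaf(name: String) extends Expr
final case class Node(left: Expr, right: Expr) extends Expr

object BalancedCombine {
  // Combine adjacent pairs level by level until a single expression remains.
  // Each pass roughly halves the number of items, so the loop runs about
  // log2(n) times and the build itself uses no recursion.
  def combineBalanced(exprs: Seq[Expr], op: (Expr, Expr) => Expr): Expr = {
    require(exprs.nonEmpty, "need at least one expression")
    var current: IndexedSeq[Expr] = exprs.toIndexedSeq
    while (current.size > 1) {
      // grouped(2) yields pairs, plus a trailing singleton when the size is odd.
      current = current.grouped(2).map {
        case Seq(a, b) => op(a, b)
        case Seq(a)    => a
      }.toIndexedSeq
    }
    current.head
  }

  // Height of the resulting tree, counting a leaf as 0.
  def depth(e: Expr): Int = e match {
    case Leaf(_)    => 0
    case Node(l, r) => 1 + math.max(depth(l), depth(r))
  }
}
```

For example, `BalancedCombine.combineBalanced((1 to 5).map(i => Leaf(s"p$i")), Node(_, _))` builds a tree of height 3, and 1000 leaves give a tree of height 10.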

Why are the changes needed?

For better performance.

Does this PR introduce any user-facing change?

No

How was this patch tested?

Existing unit tests.
Also, a quick benchmark:

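  // Assumes a suite that mixes in PredicateHelper, with And and Literal
  // imported from org.apache.spark.sql.catalyst.expressions.
  test("buildBalancedPredicate") {
    // 1000 trivially true literals to combine into one balanced And tree.
    val expressions = (1 to 1000).map(_ => Literal(true))
    val start = System.currentTimeMillis()
    buildBalancedPredicate(expressions, And)
    // Elapsed wall-clock time in milliseconds for a single run.
    println(System.currentTimeMillis() - start)
  }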

Before: 47ms
After: 4ms

@gengliangwang
Member Author

cc @Swinky @cloud-fan @maropu

@Swinky
Contributor

Swinky commented Mar 3, 2021

For N predicates, the recursive buildBalancedPredicate puts only about log2(N) calls on the stack.

If the predicate list were large enough for that depth to matter, wouldn't splitDisjunctivePredicates and splitConjunctivePredicates already fail with a StackOverflowError before reaching this point?
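To make the depth argument concrete, the sketch below measures how deep the call stack gets for a split-in-half recursive build versus a recursive flattening of a left-deep chain. Everything here (`Pred`, `Atom`, `AndP`, `buildRecursive`, `splitRecursive`) is a hypothetical toy model, not Spark's actual code, and the left-deep input shape is an assumption chosen to show the worst case.

```scala
// Toy predicate tree; hypothetical, not Catalyst's Expression hierarchy.
sealed trait Pred
final case class Atom(id: Int) extends Pred
final case class AndP(left: Pred, right: Pred) extends Pred

object StackDepthDemo {
  // Recursive split-in-half build; also reports the deepest call level reached.
  def buildRecursive(ps: IndexedSeq[Pred], level: Int = 1): (Pred, Int) =
    if (ps.size == 1) (ps.head, level)
    else {
      val (l, r) = ps.splitAt(ps.size / 2)
      val (lt, ld) = buildRecursive(l, level + 1)
      val (rt, rd) = buildRecursive(r, level + 1)
      (AndP(lt, rt), math.max(ld, rd))
    }

  // Recursive flattening in the style of a conjunct splitter; returns the
  // conjuncts together with the deepest call level reached.
  def splitRecursive(p: Pred, level: Int = 1): (List[Pred], Int) = p match {
    case AndP(l, r) =>
      val (ls, ld) = splitRecursive(l, level + 1)
      val (rs, rd) = splitRecursive(r, level + 1)
      (ls ++ rs, math.max(ld, rd))
    case other => (List(other), level)
  }
}
```

With 1000 atoms, `buildRecursive` reaches call level 11, while `splitRecursive` on the left-deep chain `atoms.reduceLeft(AndP(_, _))` reaches call level 1000. In this toy model the splitter's depth grows linearly with N, whereas the build's depth grows logarithmically.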

@SparkQA

SparkQA commented Mar 3, 2021

Kubernetes integration test starting
URL: https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder-K8s/40288/

@SparkQA

SparkQA commented Mar 3, 2021

Kubernetes integration test status success
URL: https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder-K8s/40288/

@gengliangwang
Member Author

@Swinky I have updated the description. It's for better performance.

@maropu changed the title from "[SPARK-34222][SQL][FollowUp] Non-recursive implementation of buildBalancedPredicate" to "[SPARK-34222][SQL][FOLLOWUP] Non-recursive implementation of buildBalancedPredicate" on Mar 3, 2021
@github-actions github-actions bot added the SQL label Mar 3, 2021
@Swinky
Contributor

Swinky commented Mar 3, 2021

Thanks @gengliangwang for this!

@SparkQA

SparkQA commented Mar 3, 2021

Test build #135706 has finished for PR 31724 at commit d0e2d74.

  • This patch passes all tests.
  • This patch merges cleanly.
  • This patch adds no public classes.

@gengliangwang
Member Author

@cloud-fan @maropu @Swinky Thanks for the review.
Merging to master.

Member

@HyukjinKwon HyukjinKwon left a comment

LGTM2
