Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

[SPARK-17396][core] Share the task support between UnionRDD instances. #14985

Closed
wants to merge 1 commit into from

Conversation

rdblue
Copy link
Contributor

@rdblue rdblue commented Sep 6, 2016

What changes were proposed in this pull request?

Share the ForkJoinTaskSupport between UnionRDD instances to avoid creating a huge number of threads if lots of RDDs are created at the same time.

How was this patch tested?

This uses existing UnionRDD tests.

This removes the private task support in each UnionRDD and replaces it
with a shared pool on the UnionRDD object.
@SparkQA
Copy link

SparkQA commented Sep 7, 2016

Test build #65012 has finished for PR 14985 at commit 89065fd.

  • This patch fails Spark unit tests.
  • This patch merges cleanly.
  • This patch adds no public classes.

@srowen
Copy link
Member

srowen commented Sep 7, 2016

Looks good to me as a minimal change. It's not hard to use the Java executors here too if that would be preferable but would require more rewrite and some more code.

@SparkQA
Copy link

SparkQA commented Sep 7, 2016

Test build #3252 has finished for PR 14985 at commit 89065fd.

  • This patch passes all tests.
  • This patch does not merge cleanly.
  • This patch adds no public classes.

@srowen
Copy link
Member

srowen commented Sep 9, 2016

Jenkins retest this please

@srowen
Copy link
Member

srowen commented Sep 9, 2016

Just retesting to see what that 'does not merge cleanly' is about.
If there are no other comments I'll merge, as this is a reasonably important fix.

@JoshRosen
Copy link
Contributor

LGTM as well. This seems like a reasonable fix.

@rdblue
Copy link
Contributor Author

rdblue commented Sep 9, 2016

I can also rebase on master. I'm surprised it isn't merging cleanly, it's a tiny patch to code that hardly ever changes.

@srowen
Copy link
Member

srowen commented Sep 9, 2016

I think it merges fine, because github says so. I don't know why Jenkins reported that it didn't merge but that it succeeded.

@SparkQA
Copy link

SparkQA commented Sep 9, 2016

Test build #65160 has finished for PR 14985 at commit 89065fd.

  • This patch passes all tests.
  • This patch merges cleanly.
  • This patch adds no public classes.

asfgit pushed a commit that referenced this pull request Sep 10, 2016
## What changes were proposed in this pull request?

Share the ForkJoinTaskSupport between UnionRDD instances to avoid creating a huge number of threads if lots of RDDs are created at the same time.

## How was this patch tested?

This uses existing UnionRDD tests.

Author: Ryan Blue <blue@apache.org>

Closes #14985 from rdblue/SPARK-17396-use-shared-pool.

(cherry picked from commit 6ea5055)
Signed-off-by: Sean Owen <sowen@cloudera.com>
@srowen
Copy link
Member

srowen commented Sep 10, 2016

Merged to master/2.0

@asfgit asfgit closed this in 6ea5055 Sep 10, 2016
zzcclp added a commit to zzcclp/spark that referenced this pull request Sep 13, 2016
wgtmac pushed a commit to wgtmac/spark that referenced this pull request Sep 19, 2016
## What changes were proposed in this pull request?

Share the ForkJoinTaskSupport between UnionRDD instances to avoid creating a huge number of threads if lots of RDDs are created at the same time.

## How was this patch tested?

This uses existing UnionRDD tests.

Author: Ryan Blue <blue@apache.org>

Closes apache#14985 from rdblue/SPARK-17396-use-shared-pool.
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
4 participants