Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

[SPARK-18742][CORE]Clarify that user-defined BroadcastFactory is not supported #16173

Closed
wants to merge 5 commits into from

Conversation

windpiger
Copy link
Contributor

@windpiger windpiger commented Dec 6, 2016

What changes were proposed in this pull request?

After SPARK-12588 Remove HTTPBroadcast [1], the one and only implementation of BroadcastFactory is TorrentBroadcastFactory and the spark.broadcast.factory conf has removed.

however the scaladoc says [2]:

/**

  • An interface for all the broadcast implementations in Spark (to allow
  • multiple broadcast implementations). SparkContext uses a user-specified
  • BroadcastFactory implementation to instantiate a particular broadcast for the
  • entire Spark job.
    */

so we should modify the comment that SparkContext will not use a user-specified BroadcastFactory implementation

[1] https://issues.apache.org/jira/browse/SPARK-12588
[2] https://github.com/apache/spark/blob/master/core/src/main/scala/org/apache/spark/broadcast/BroadcastFactory.scala#L25-L30

How was this patch tested?

unit test added

@SparkQA
Copy link

SparkQA commented Dec 6, 2016

Test build #69732 has finished for PR 16173 at commit a17d325.

  • This patch fails Scala style tests.
  • This patch merges cleanly.
  • This patch adds no public classes.

@SparkQA
Copy link

SparkQA commented Dec 6, 2016

Test build #69733 has finished for PR 16173 at commit 1b4d4b6.

  • This patch fails Scala style tests.
  • This patch merges cleanly.
  • This patch adds no public classes.

@windpiger windpiger changed the title [SPARK-18742][CORE]readd spark.broadcast.factory conf to implement er-defined BroadcastFactory [SPARK-18742][CORE]readd spark.broadcast.factory conf to implement uer-defined BroadcastFactory Dec 6, 2016
@SparkQA
Copy link

SparkQA commented Dec 6, 2016

Test build #69734 has finished for PR 16173 at commit f23becc.

  • This patch passes all tests.
  • This patch merges cleanly.
  • This patch adds no public classes.

@srowen
Copy link
Member

srowen commented Dec 6, 2016

You certainly have a point there, but I also don't know that there's actually an intent to do the work to truly support and maintain an SPI for broadcasts.

@windpiger windpiger changed the title [SPARK-18742][CORE]readd spark.broadcast.factory conf to implement uer-defined BroadcastFactory [SPARK-18742][CORE]readd spark.broadcast.factory conf to implement user-defined BroadcastFactory Dec 7, 2016
@windpiger
Copy link
Contributor Author

since the trait BroadcastFactory is design for multiple implements, is't reasonable to provide a way to change the implement of BroadcastFactory? or only TorrentBroadcastFactory is enough

@srowen
Copy link
Member

srowen commented Dec 7, 2016

It existed to support several internal implementations, of which there is only one now. I don't know that it was ever intended for external implementations.

@windpiger
Copy link
Contributor Author

ok, the BroadcastFactory' comment shows SparkContext uses a user-specified BroadcastFactory implementation to instantiate a particular broadcast for the entire Spark job. so I think it is designed for external implementations.

@srowen
Copy link
Member

srowen commented Dec 7, 2016

Hm, that's a good point. I don't recall ever hearing much about someone actually plugging in an implementation. It's a modest change you're proposing and not unreasonable, I just don't know how much this API is actually robust right now, or whether it really should be.

@windpiger
Copy link
Contributor Author

when HttpBroadCastFactory & TorrentBroadcastFactory both exists, it is use spark.broadcast.factory to change the implement.

it is better to modify the comment of BroadcastFactory than readd the spark.broadcast.factory back?

@windpiger
Copy link
Contributor Author

windpiger commented Dec 8, 2016

@srowen

@srowen
Copy link
Member

srowen commented Dec 8, 2016

Yeah those were Spark classes. This property doesn't seem to be used now.
It's possible to restore this but I don't know if it's intended now. Yes I suppose you could update the comment instead, but it doesn't seem like a big deal.

@windpiger
Copy link
Contributor Author

ok,someone else can tell if it is resonable to readd the conf now?

@windpiger
Copy link
Contributor Author

cc @rxin could you help to give some advise? thanks!

@rxin
Copy link
Contributor

rxin commented Dec 14, 2016

Is there actually an implementation you want to do? I'd push back against opening this up unless we have a good reason to (e.g. more scalable broadcast).

@windpiger
Copy link
Contributor Author

if the trait BroadcastFactory is designed for user not for internal, it is better to add this conf eventhough there is no a more scalable broadcast. Or we can modify the comment https://github.com/apache/spark/blob/master/core/src/main/scala/org/apache/spark/broadcast/BroadcastFactory.scala#L25-L30

@srowen
Copy link
Member

srowen commented Dec 14, 2016

I wouldn't open it up just to open it up. If there were a pretty good use case for external implementations then sure. Otherwise, at best, change the comment, yeah.

@windpiger
Copy link
Contributor Author

ok,just modify the comment, thanks!

@SparkQA
Copy link

SparkQA commented Dec 15, 2016

Test build #70168 has finished for PR 16173 at commit 49d799f.

  • This patch fails Spark unit tests.
  • This patch merges cleanly.
  • This patch adds no public classes.

@SparkQA
Copy link

SparkQA commented Dec 15, 2016

Test build #3499 has finished for PR 16173 at commit 49d799f.

  • This patch fails PySpark unit tests.
  • This patch merges cleanly.
  • This patch adds no public classes.

@windpiger
Copy link
Contributor Author

retest this please

@SparkQA
Copy link

SparkQA commented Dec 16, 2016

Test build #70227 has finished for PR 16173 at commit 49d799f.

  • This patch passes all tests.
  • This patch merges cleanly.
  • This patch adds no public classes.

@srowen
Copy link
Member

srowen commented Dec 16, 2016

@windpiger you need to update the title/description of this and the JIRA too.

@windpiger
Copy link
Contributor Author

ok,thanks

@windpiger windpiger changed the title [SPARK-18742][CORE]readd spark.broadcast.factory conf to implement user-defined BroadcastFactory [SPARK-18742][CORE]Clarify that user-defined BroadcastFactory is not supported Dec 16, 2016
@asfgit asfgit closed this in 53ab8fb Dec 16, 2016
uzadude pushed a commit to uzadude/spark that referenced this pull request Jan 27, 2017
… supported

## What changes were proposed in this pull request?
After SPARK-12588 Remove HTTPBroadcast [1], the one and only implementation of BroadcastFactory is TorrentBroadcastFactory and the spark.broadcast.factory conf has removed.

however the scaladoc says [2]:

/**
 * An interface for all the broadcast implementations in Spark (to allow
 * multiple broadcast implementations). SparkContext uses a user-specified
 * BroadcastFactory implementation to instantiate a particular broadcast for the
 * entire Spark job.
 */

so we should modify the comment that SparkContext will not use a  user-specified BroadcastFactory implementation

[1] https://issues.apache.org/jira/browse/SPARK-12588
[2] https://github.com/apache/spark/blob/master/core/src/main/scala/org/apache/spark/broadcast/BroadcastFactory.scala#L25-L30

## How was this patch tested?
unit test added

Author: root <root@iZbp1gsnrlfzjxh82cz80vZ.(none)>
Author: windpiger <songjun@outlook.com>

Closes apache#16173 from windpiger/addBroadFactoryConf.
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
4 participants