[SPARK-30033][core] Manage shuffle IO plugins using Spark's plugin system. #26670

vanzin · 2019-11-26T00:32:18Z

SPARK-25299 is introducing a new plugin interface for shuffle IO; currently,
parts of that API provide lifecycle methods that are already covered by the
plugin API that was added in SPARK-29396.

This change makes some modifications so that:

- The driver and executor components of the shuffle plugin extend their
  respective counterparts in the generic plugin API.
- The shuffle IO plugin is managed by the same code that manages other
  generic plugins.
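
The two points above can be illustrated with a small, self-contained sketch. It is not the code in this PR: `GenericExecutorPlugin`, `ShuffleExecutorPart`, `PluginManager`, and the two implementations are hypothetical stand-ins, and only the executor side is shown (the driver side would presumably follow the same pattern). The idea is that the shuffle component inherits the generic lifecycle, so a single manager can drive it alongside any other plugin.

```java
import java.util.ArrayList;
import java.util.List;

// Illustrative only: every type below is a hypothetical stand-in, not Spark's API.
class SharedLifecycleSketch {
    public static void main(String[] args) {
        System.out.println(run());
    }

    // Registers one ordinary plugin and one shuffle plugin with the same manager.
    static List<String> run() {
        List<String> log = new ArrayList<>();
        PluginManager manager = new PluginManager();
        manager.register(new MetricsExecutorPlugin(log));
        manager.register(new LocalDiskShuffleExecutor(log));
        manager.initAll("exec-1");
        manager.shutdownAll();
        return log;
    }
}

// Generic executor-side lifecycle, standing in for the generic plugin API.
interface GenericExecutorPlugin {
    void init(String executorId);
    void shutdown();
}

// The shuffle component inherits the lifecycle and adds only shuffle-specific calls.
interface ShuffleExecutorPart extends GenericExecutorPlugin {
    String describeWriter(int shuffleId, int mapId);
}

class MetricsExecutorPlugin implements GenericExecutorPlugin {
    private final List<String> log;
    MetricsExecutorPlugin(List<String> log) { this.log = log; }
    public void init(String executorId) { log.add("metrics init " + executorId); }
    public void shutdown() { log.add("metrics shutdown"); }
}

class LocalDiskShuffleExecutor implements ShuffleExecutorPart {
    private final List<String> log;
    LocalDiskShuffleExecutor(List<String> log) { this.log = log; }
    public void init(String executorId) { log.add("shuffle init " + executorId); }
    public void shutdown() { log.add("shuffle shutdown"); }
    public String describeWriter(int shuffleId, int mapId) {
        return "writer-" + shuffleId + "-" + mapId;
    }
}

// One manager drives every plugin; it does not special-case the shuffle plugin.
class PluginManager {
    private final List<GenericExecutorPlugin> plugins = new ArrayList<>();
    void register(GenericExecutorPlugin plugin) { plugins.add(plugin); }
    void initAll(String executorId) { plugins.forEach(p -> p.init(executorId)); }
    void shutdownAll() { plugins.forEach(GenericExecutorPlugin::shutdown); }
}
```

In this sketch the manager only ever sees the generic interface, which is why the separate lifecycle methods on the shuffle side become redundant.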

This simplifies and reuses similar code that exists in both implementations,
and also provides more functionality to shuffle plugins: not only do they have
more contextual information (without having to query APIs like SparkEnv) but
they also have access to other functionality in the plugin API that would
otherwise require touching internal Spark APIs.

There is a small change to the generic plugin API to avoid registering an
RPC endpoint and starting threads when not needed; plugins now must explicitly
say they want to handle RPC messages for the endpoint to be created. This is
done because the default shuffle plugin is now loaded by the plugin system,
and does not need the RPC functionality. (This API hasn't been released yet
so it's ok to make the change.)
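
The opt-in behavior described above could be sketched roughly as follows. This is an assumption-laden illustration, not the actual API change: `DriverSidePlugin`, `wantsRpc()`, `receive()`, and `PluginContainer` are hypothetical names, and the "endpoint" is reduced to a list entry so the example stays self-contained.

```java
import java.util.ArrayList;
import java.util.List;

// Illustrative only: all names here are hypothetical, not Spark's plugin API.
class RpcOptInSketch {
    public static void main(String[] args) {
        System.out.println(endpointsFor(new QuietShufflePlugin(), new ChattyPlugin()));
    }

    // Loads the given plugins and returns the names of those that got an endpoint.
    static List<String> endpointsFor(DriverSidePlugin... plugins) {
        PluginContainer container = new PluginContainer();
        for (DriverSidePlugin p : plugins) {
            container.load(p);
        }
        return container.endpointOwners();
    }
}

interface DriverSidePlugin {
    String name();

    // Hypothetical opt-in flag; the default is "no RPC handling".
    default boolean wantsRpc() { return false; }

    // Only meaningful for plugins that opted in.
    default String receive(String message) {
        throw new UnsupportedOperationException(name() + " does not handle RPC");
    }
}

// Stand-in for a default shuffle plugin that needs no RPC functionality.
class QuietShufflePlugin implements DriverSidePlugin {
    public String name() { return "shuffle"; }
}

// A plugin that explicitly asks for an endpoint.
class ChattyPlugin implements DriverSidePlugin {
    public String name() { return "chatty"; }
    @Override public boolean wantsRpc() { return true; }
    @Override public String receive(String message) { return "ack:" + message; }
}

class PluginContainer {
    private final List<String> endpointOwners = new ArrayList<>();

    void load(DriverSidePlugin plugin) {
        // The endpoint (and any threads behind it) is created only on request.
        if (plugin.wantsRpc()) {
            endpointOwners.add(plugin.name());
        }
    }

    List<String> endpointOwners() { return endpointOwners; }
}
```

With this shape, loading the default shuffle plugin through the plugin system costs nothing in RPC terms unless the plugin asks for it.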

The only downside is that initialization of the SortShuffleManager in executors
is a bit weird, because of the order in which things are initialized: the
shuffle manager is initialized by SparkEnv, and plugin initialization happens
after that. In any case, all initialization is done before any tasks are
allowed to run.

Currently, the shuffle plugin is always loaded, regardless of whether the sort
shuffle manager is being used; this was already the case in the driver, but
now is also the case in the executors. It shouldn't be hard to fix that if
needed.

Tested with existing and updated unit tests.


SparkQA · 2019-11-26T02:46:17Z

Test build #114434 has finished for PR 26670 at commit 3687d8f.

This patch passes all tests.
This patch merges cleanly.
This patch adds no public classes.

core/src/main/java/org/apache/spark/api/plugin/PluginRpcEndpoint.java

dongjoon-hyun · 2019-11-26T04:20:37Z

It's surprising to me that the PR Builder ignores lint-java, but we have a GitHub Action for that.

SparkQA · 2019-11-26T19:38:53Z

Test build #114475 has finished for PR 26670 at commit 902ee39.

This patch passes all tests.
This patch merges cleanly.
This patch adds no public classes.

SparkQA · 2020-01-03T19:44:39Z

Test build #116100 has finished for PR 26670 at commit 9d87dd9.

This patch passes all tests.
This patch merges cleanly.
This patch adds no public classes.

tgravescs · 2020-01-24T18:49:30Z

The only downside is that initialization of the SortShuffleManager in executors
is a bit weird, because of the order in which things are initialized: the
shuffle manager is initialized by SparkEnv, and plugin initialization happens
after that. In any case, all initialization is done before any tasks are
allowed to run.

I haven't looked at the code yet; what exactly do you mean by "initialized by SparkEnv"? Do you just mean that SparkEnv instantiates the shuffle manager class, so it's created before you would possibly want to initialize something in the plugin? Within the executor plugin, it seems like having other plugins initialize first would make sense in case shuffle relied on another plugin's initialization, but I suppose the reverse could be true as well.

SparkQA · 2020-01-24T19:14:50Z

Test build #117361 has finished for PR 26670 at commit 6bba5ef.

This patch passes all tests.
This patch merges cleanly.
This patch adds no public classes.

vanzin · 2020-01-28T01:19:29Z

I haven't looked at the code yet, what exactly do you mean initialized by SparkEnv?

The shuffle manager is initialized by SparkEnv (look for shuffleManager in the source), and the plugin context needs to be initialized after SparkEnv, currently. Kind of a chicken-and-egg thing.

Some more verbose initialization where the plugin context has multiple "init" methods (for pre- and post- SparkEnv, for example) could solve it, but then the internal state of the plugin context needs to be mutable.

Anyway, didn't look too much into making that initialization cleaner.
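
For illustration, the multi-phase idea mentioned above might look something like the sketch below. Nothing here comes from the PR: `TwoPhaseContext`, `preEnvInit()`, `postEnvInit()`, and `FakeEnv` are hypothetical names. It mainly shows the cost noted above, namely that the context has to carry mutable state that is unset until the second phase runs.

```java
// Illustrative only: hypothetical names, not the plugin context in Spark.
class TwoPhaseInitSketch {
    public static void main(String[] args) {
        System.out.println(run());
    }

    // Describes the context before and after the environment becomes available.
    static String run() {
        TwoPhaseContext ctx = new TwoPhaseContext("exec-1");
        ctx.preEnvInit();
        String before = ctx.describe();
        ctx.postEnvInit(new FakeEnv("sort"));
        return before + " -> " + ctx.describe();
    }
}

// Stand-in for the environment object that creates the shuffle manager.
class FakeEnv {
    final String shuffleManagerName;
    FakeEnv(String shuffleManagerName) { this.shuffleManagerName = shuffleManagerName; }
}

class TwoPhaseContext {
    private final String executorId;
    private boolean preDone;
    private FakeEnv env; // mutable: unknown until the second phase runs

    TwoPhaseContext(String executorId) { this.executorId = executorId; }

    // Phase 1: anything that does not depend on the environment.
    void preEnvInit() { preDone = true; }

    // Phase 2: runs once the environment (and its shuffle manager) exists.
    void postEnvInit(FakeEnv env) {
        if (!preDone) {
            throw new IllegalStateException("preEnvInit must run first");
        }
        this.env = env;
    }

    String describe() {
        return executorId + ":" + (env == null ? "env-pending" : env.shuffleManagerName);
    }
}
```

Callers of such a context would need to tolerate the "env-pending" window, which is the kind of extra complexity that makes a single immutable initialization step attractive.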

github-actions · 2020-05-08T00:11:02Z

We're closing this PR because it hasn't been updated in a while. This isn't a judgement on the merit of the PR in any way. It's just a way of keeping the PR queue manageable.
If you'd like to revive this PR, please reopen it and ask a committer to remove the Stale tag!

dongjoon-hyun reviewed Nov 26, 2019

dongjoon-hyun added the SPARK CORE label Nov 26, 2019

Import cleanup.

902ee39

Merge branch 'master' into SPARK-30033

9d87dd9

Merge branch 'master' into SPARK-30033

6bba5ef

tgravescs mentioned this pull request Jan 27, 2020

[SPARK-30638][CORE] Add resources allocated to PluginContext #27367

Closed

vanzin mentioned this pull request Feb 4, 2020

WIP: Output Tracking API vanzin/spark#53

Closed

github-actions bot added the Stale label May 8, 2020

github-actions bot closed this May 9, 2020