
Conversation

@DaveDeCaprio (Contributor)

What changes were proposed in this pull request?

This fix addresses the issue raised in SPARK-26957. If spark.scheduler.pool is set to a name that does not match a preconfigured pool, the pool is created with default properties that were previously hardcoded. This fix makes those default scheduler pool properties configurable through configuration parameters.

The fix is fully backwards compatible because the configuration properties default to the existing hardcoded values; this PR just allows those values to be updated. The specific use case that motivated this is explained in SPARK-26957.
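To make this concrete, setting the new defaults might look like the sketch below. The property names here are hypothetical placeholders; the actual keys are the config entries introduced by this PR, such as the one backing the SCHEDULER_DEFAULT_SCHEDULING_MODE constant seen in the diff further down.

import org.apache.spark.SparkConf

val conf = new SparkConf()
  .set("spark.scheduler.mode", "FAIR")
  // Hypothetical key names for the defaults applied to dynamically created pools:
  .set("spark.scheduler.pool.default.schedulingMode", "FAIR")
  .set("spark.scheduler.pool.default.minShare", "1")
  .set("spark.scheduler.pool.default.weight", "10")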

How was this patch tested?

A unit test was added to PoolSuite that sets the config values, generates a pool, and checks that the pool is using the configured values. An existing test verifies that with no special configuration settings, the original defaults are used.

In addition to this unit testing, we are currently using a version of this patch in production on AWS EMR for Spark 2.4.0.

This contribution is my original work and I license the work to the project under the project's open source license.

// The original hardcoded defaults; per this PR they become the fallback
// values for the new configuration entries:
val DEFAULT_SCHEDULING_MODE = SchedulingMode.FIFO
val DEFAULT_MINIMUM_SHARE = 0
val DEFAULT_WEIGHT = 1
// The default scheduling mode is now read from configuration instead:
val defaultSchedulingMode = SchedulingMode.withName(conf.get(SCHEDULER_DEFAULT_SCHEDULING_MODE))
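As illustrative context for the snippet above, here is a rough sketch of how configured defaults could flow into dynamic pool creation. SCHEDULER_DEFAULT_MIN_SHARE and SCHEDULER_DEFAULT_WEIGHT are assumed companion config entries, and this is a sketch of the idea, not the exact patch:

// Sketch only: read the remaining pool defaults from configuration as well.
val defaultMinShare = conf.get(SCHEDULER_DEFAULT_MIN_SHARE)   // assumed config entry
val defaultWeight = conf.get(SCHEDULER_DEFAULT_WEIGHT)        // assumed config entry

// When a TaskSet names a pool that was not preconfigured, build it from the
// configured defaults rather than the hardcoded ones:
val pool = new Pool(poolName, defaultSchedulingMode, defaultMinShare, defaultWeight)
rootPool.addSchedulable(pool)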
@beliefer (Contributor) commented Feb 22, 2019

This change looks good. If the user enables the Spark fair scheduler, the default pool always uses FIFO mode.

@srowen (Member) commented Feb 22, 2019

I don't know much about this part, but if you want a non-default pool, wouldn't you configure your own pool? Should the defaults be configurable, rather than there being just one fixed default?

@DaveDeCaprio (Contributor, Author)

The only way to configure your own pools is via an XML file loaded at startup. In our case we'd like the system to be more dynamic, rather than having to hardcode pools in config files. Spark already supports creating pools dynamically like this, but it only configures them with the default parameters.

I cover the specific details on why we need this in the Jira ticket. Basically, we want a low priority pool and a set of dynamic pools for different projects that run at a higher priority. There is no way to do that with the current setup.
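For background, jobs are routed to pools through a thread-local property, and naming a pool that fairscheduler.xml does not define is exactly what triggers the dynamic creation at issue. A minimal sketch (the pool name and app setup are illustrative):

import org.apache.spark.{SparkConf, SparkContext}

val sc = new SparkContext(
  new SparkConf()
    .setMaster("local[2]")
    .setAppName("dynamic-pools-example")
    .set("spark.scheduler.mode", "FAIR"))

// "projectA" appears in no fairscheduler.xml; Spark creates the pool on first
// use, currently with the hardcoded defaults (FIFO, minShare = 0, weight = 1).
sc.setLocalProperty("spark.scheduler.pool", "projectA")
sc.parallelize(1 to 100).count()  // runs in the dynamically created pool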

@AmplabJenkins

Can one of the admins verify this patch?

@DaveDeCaprio (Contributor, Author)

Is there a way to move this forward?

@github-actions

We're closing this PR because it hasn't been updated in a while. This isn't a judgement on the merit of the PR in any way. It's just a way of keeping the PR queue manageable.
If you'd like to revive this PR, please reopen it and ask a committer to remove the Stale tag!

@github-actions bot added the Stale label on Mar 27, 2020
@github-actions bot closed this on Mar 28, 2020
